html version: Supplementary information

library(dplyr)

library(ggplot2)
library(ggforce)
library(ggpubr)
library(cowplot)
library(gridExtra)
library(colorspace)

library(gamlss)

library(kableExtra)

library(rsq)

fCO2.atm.2020 <- 411.51 * 1e-6 # oct 2020
fCO2.atm.2019 <- 408.75 * 1e-6 # oct 2019
fCO2.atm.2004 <- 374.63 * 1e-6 # oct 2004
fCO2.atm.1995 <- 360.17 * 1e-6
library(captioner)
fig_nums <- captioner() 
tab_nums <- captioner(prefix = "Table")

1 Import data

The data was prepared in Data preparation

norway <- readRDS("norway.rds")
nlss <- readRDS("nlss.rds")

norway$TOC_mg <- norway$TOC*12.011*1e3
norway$TOC_umol <- norway$TOC*1e6
norway$TA_ueq <- norway$TA * 1e6
norway$CA_ueq <- norway$CA * 1e6
norway$DIC_mg <- norway$DIC*12.011*1e3
norway$DIC_umol <- norway$DIC * 1e6
norway$CA_ueq <- norway$CA * 1e6
norway$Hplus_umol <- norway$Hplus *1e6
norway$invHplus_umol <- 1/norway$Hplus *1e6
norway$invHplus <- 1/norway$Hplus

ggplot()+
   geom_point(data = filter(norway, fCO2 > 2500*1e-6),aes(x=long, y = lat, col = survey), size = 6, shape = 1)+
  geom_point(data = filter(norway, fCO2 < 2500*1e-6),aes(x=long, y = lat, col = survey, size = fCO2*1e6))+
  borders(regions = c("Norway"))+ylim(57,63)+xlim(4,13)+
  scale_color_manual(values = c("firebrick","dodgerblue3"))+
 # scale_fill_continuous_divergingx(palette = "RdYlBu", aesthetics = c("fill","col"), mid = 30, rev = T)+ 
  labs(col="Survey", size = "pCO2, uatm")+
  theme_void()

nlss <- readRDS("nlss.rds")
nlss$invHplus <- 1/nlss$Hplus
nlss$TOC_mg <- nlss$TOC*12.011*1e3

ggplot(nlss)+geom_point(aes(x=long, y = lat, col = TOC_mg))+
  borders(regions = c("Norway","Sweden","Finland"))+ylim(55,72)+xlim(4,32)+
  scale_fill_continuous_divergingx(palette = "RdYlBu", aesthetics = c("fill","col"), mid = 10, rev = T)+
  labs(x="TOC, mg/L")+
  theme_void()

Figure 1: Map of TOC in mg/L in the Northern Lake Survey

2 Estimation of organic and carbonate alkalinity

In this section, we investigate several ways of estimating the share of organic alkalinity in our samples.

2.1 Carbonate speciation N112 / CBA

The concentration of the carbonate species in the N112 and CBA survey is calculated directly from the measured partial pressure of \(CO_2\) (Appelo and Postma 2005):

\(pCO_2 = [H_2CO_3]/K_0\)

\([HCO3^-] = K_1 \times [H_2CO_3]/[H^+]\)

\([CO_3^{2-}] = K_2 \times [HCO_3 ^- ] / [H ^+]\)

\([TIC] = [H_2CO_3] + [HCO_3^-]+[CO_3 ^{2-}]\)

\(K_0\) is Henry’s constant from Weiss (1974). \(K_1\) and \(K_2\) are ionization constants of \(H_2CO_3\) and \(HCO_3^-\) , respectively obtained from Herbert S. Harned and Davis (1943) and Herbert S. Harned and Scholes (1941) and are adjusted for temperature.

In freshwater samples, the chemical activity of carbonate species is assumed to be equal to the molar concentration. Therefore, molar concentration is used in all equilibrium equations.

Liu, Butman, and Raymond (2020) reported that in water with low ionic strength, the pH is underestimated by 0.1 to 0.5 units.  We chose to not correct pH in this study because the ionic strength was not available for the N112 survey. Since only one lake in the Northern Lakes Survey had a pH inferior to 5, we assumed that the measurement errors would be minimal.

ggplot(norway, aes(x = pH))+
  geom_point(aes(y = H2CO3, col = "H2CO3"), alpha = 0.5)+
  geom_point(aes(y = HCO3, col = "HCO3"), alpha = 0.5)+
  geom_point(aes(y = CO3, col = "CO3"), alpha = 0.5)+
  labs(y = "Carbonate species (mol/L)", col = "")+
  scale_color_manual(values = c( "firebrick","dodgerblue3","orange"))+
  facet_grid(cols = vars(survey))+
  theme_minimal()+
  theme(legend.position = "bottom")

Figure 2: Concentration of carbonate species in mol/L in the CBA and N112 surveys

ggplot(norway,aes(x= pH))+geom_point(aes(y = ta_fCO2, col = "pCO2 from TA"), alpha = 0.5)+
  geom_point(aes(y = fCO2, col = "Measured pCO2"), alpha = 0.5)+
  labs(y = "pCO2, atm", x = "pH", col = "")+
  scale_color_manual(values = c( "firebrick","dodgerblue3"))+
  facet_grid(cols = vars(survey))+
  theme_bw(base_size = 15)+
  theme(legend.position = "bottom")

Figure 3: Measured (red) and calculated (blue) pCO2 in atm for the CBA and N112 surveys

toc_boxplot <- ggplot(norway)+
  geom_boxplot(aes(x = "TOC", y = TOC, col = survey))+
  geom_boxplot(aes(x = "DIC", y = DIC, col = survey))+
  labs(x = "", y = "concentration, mol/L")+
  scale_color_manual(values = c("firebrick", "dodgerblue"))+
  theme_minimal()+
  theme(legend.position = "bottom")

toc_boxplot

Figure 4: Boxplot of DIC and TOC in mol/L for the CBA survey (red) and N112 survey (blue)

2.2 Organic alkalinity as OA = TA - CA

A simple method to estimate the organic alkalinity would be to substract the carbonate alkalinity, known thanks to the DIC measurement, to the total alkalinity. In a second step, one can investigate the link between the TOC concentration and this calculated organic alkalinity. This method is used in Couturier et al. (2022).

norway$OA <- with(norway, TA-CA)

ggarrange(
ggplot(norway)+geom_point(aes(x = pH, y = CA))+theme_bw(),
ggplot(norway)+geom_point(aes(x = pH, y = OA))+ylim(-1e-4, 1e-4)+theme_bw(),
ggplot(norway)+geom_point(aes(x = TOC, y = OA))+theme_bw(),
ggplot(norway)+geom_point(aes(x = TA, y = OA))+theme_bw(), nrow = 2, ncol = 2)

Figure 5: OA estimated as TA - CA, in eq/L

In our samples, there is no link between organic alkalinity calculated that way and the concentration of TOC. This might result from other sources of alkalinity in our samples.

2.3 Organic alkalinity according to Hruška et al. (2003)

Hruška et al. (2003) suggested that the organic alkalinity (OA) can be computed as a triprotic model. OA is the defined as the concentration of organic bases \([RCOO^-]\) in the solution, minus the concentration of organic bases at pH = 4.5, as for an end-point titration. \([RCOO^-]\) depends on three \(pK_a\) that were measured in Hruška et al. (2003):

\[[RCOO^-]=\alpha_1 \times \frac{m \times [TOC]}{3}+ \alpha_2 \times \frac{m \times [TOC]}{3} + \alpha_3 \times \frac{m \times [TOC]}{3}\]

Where α is the ionization fraction of each acid/base couple:

\[\alpha = \frac{[RCOO^-]}{[RCOOH]}=\frac{K_a}{[H^+} = \frac{10^{-pK_a}}{[H^+]}\]

And \(m\) is the site density of carboxylic group on the organic matter:

\[m = 10.2 \; \pm \; 0.6 \; \mu eq/mg \; DOC \Leftrightarrow 0.12 \; eq/mol \; DOC\]

\(m\) is supposed to be equally divided between the three acids with their respective \(pK_a\).

pKa1 <- 3.04
Ka1 <- 10^(-pKa1)

pKa2 <- 4.42
Ka2 <- 10^(-pKa2)

pKa3 <- 6.46
Ka3 <- 10^(-pKa3)

m <- 10.2*1e-6*12.011*1e3*1e-3# ueq/mg to eq/mol, site density BUT THERE IS A 1e3 PB: have to multiply by 1e-3 to stay in the range
norway$RCOO <- with(norway, (Ka1/Hplus + 2*Ka2/Hplus + 3*Ka3/Hplus)* m/3 * TOC) # estimated in eq/L

norway$RCOO_4.5 <- with(norway, (Ka1/10^(-4.5) + 2*Ka2/10^(-4.5) + 3*Ka3/10^(-4.5))* m/3 * TOC)

norway$OA_hruska <- with(norway, RCOO-RCOO_4.5)

ggplot(norway)+
  geom_point(aes(x = pH, y = RCOO, col = survey))+
  scale_color_manual(values = c("firebrick","dodgerblue3"))+
  labs(y="[RCOO-], mol/L")+
  theme_minimal()+
  theme(legend.position = "bottom")

norway$CA_hruska <- with(norway, TA - OA_hruska - OH + Hplus) 

norway$CO3_hruska <- with(norway,CA_hruska / (Hplus/K2 +2)) # in mol/L
norway$HCO3_hruska <- with(norway, CO3_hruska*Hplus/K2)
norway$H2CO3_hruska <- with(norway, HCO3_hruska*Hplus/K1)
norway$fCO2_hruska <- with(norway,H2CO3_hruska/K0)

Figure 6: OA estimated with triprotic method of Hruska et al., in eq/L

ggplot(norway,aes(x= pH))+geom_point(aes(y = fCO2_hruska, col = "pCO2 from calculated from Hruska et al"), alpha = 0.5)+
  geom_point(aes(y = fCO2, col = "Measured pCO2"), alpha = 0.5)+
  labs(y = "pCO2, atm", x = "pH", col = "")+
  scale_color_manual(values = c( "firebrick","dodgerblue3"))+
  facet_grid(cols = vars(survey))+
  theme_minimal(base_size = 15)+
  theme(legend.position = "bottom")

Figure 7: pCO2 estimated with Hruska method, in atm

This method results in an overestimation of organic alkalinity in samples with high concentration of total organic carbon. The calculated carbonate alkalinity, and in turn the partial pressure of CO2, is then negative, which is impossible.

2.4 Organic alkalinity according to Liu, Butman, and Raymond (2020)

A simplifying approach was suggested by Liu, Butman, and Raymond (2020) , building on Cai, Wang, and Hodson (1998) and Lozovik (2005), where the organic alkalinity can be calculated directly from TOC as:

\(OA = 80 \% \times 10\% \times [TOC]\)

80% being the proportion of DOC that is an organic acid [@Cai1998], and 10% the maximum fraction of anions contributing to alkalinity (Lozovik 2005). This last value was calculated for DOC in humic lakes, but overall this equation is designed to be applicable to all kinds of lakes, tropical or temperate.

#norway$pH_Liu <- with(norway, pH - (0.03+0.05*log10(IS))) 
#norway$Hplus_Liu <- 10^(-norway$pH_Liu) 
#norway$OH_Liu <- with(norway,Kw*Hplus_Liu)

norway$OA_Liu <- 0.08 * norway$TOC
norway$CA_Liu <- with(norway, TA-OA_Liu - OH + Hplus)

norway$CO3_Liu <- with(norway,CA_Liu / (Hplus/K2 +2)) # in mol/L
norway$HCO3_Liu <- with(norway, CO3_Liu*Hplus/K2)
norway$H2CO3_Liu <- with(norway, HCO3_Liu*Hplus/K1)
norway$fCO2_Liu <- with(norway,H2CO3_Liu/K0)

norway$DIC_Liu <- with(norway, H2CO3_Liu + HCO3_Liu + CO3_Liu)
ggplot(norway,aes(x= pH))+geom_point(aes(y = fCO2_Liu, col = "pCO2 from calculated from Liu et al."), alpha = 0.5)+
  geom_point(aes(y = fCO2, col = "Measured pCO2"), alpha = 0.5)+
  labs(y = "pCO2, atm", x = "pH", col = "")+
  scale_color_manual(values = c( "firebrick","dodgerblue3"))+
  facet_grid(cols = vars(survey))+
  theme_minimal(base_size = 15)+
  theme(legend.position = "bottom")

Figure 8: pCO2 estimated with Liu method, in atm

Once again, the organic alkalinity is overestimated in samples with high concentration of TOC, resulting in negative pCO2.

3 Model DIC and pCO2

Since the different methods to adjust the total alkalinity by removing the share of organic alkalinity, we tried different models aiming at estimating either the concentration of dissolved inorganic carbon (DIC), in order to calculate pCO2 with the carbonate equilibrium, or directly the partial pressure of CO2.

For each set of predictors, we fitted a model for DIC, then calculated the resulting pCO2; and fitted a model for pCO2.

The models used here are generalized linear model, allowing the use of the Gamma distribution.

We added a fixed intercept in the pCO2 models corresponding to the atmospheric concentration of CO2.

3.1 TA

dic.ta <- glm(formula = DIC~TA, data = norway, family = Gamma(link = "identity"))

dic.ta.sum <- summary(dic.ta)$coefficients %>% as.data.frame() 
dic.ta.sp <- rownames(dic.ta.sum)[which(dic.ta.sum$`Pr(>|t|)` < 0.05)] %>% paste(collapse = ", ")
dic.ta.sum <- dic.ta.sum %>% format(scientific = T, digits = 2)

kable(dic.ta.sum, digits = 2) %>% kable_styling(bootstrap_options = "bordered")
Estimate Std. Error t value Pr(>|t|)
(Intercept) 4.1e-05 3.3e-06 1.2e+01 6.9e-26
TA 1.3e+00 9.7e-02 1.3e+01 3.8e-27
norway$DIC_pred <- fitted(dic.ta)
norway$DIC_pred_res <- residuals(dic.ta)
dic.ta.r2 <- with(summary(dic.ta), 1 - deviance/null.deviance) %>% round(2)
dic.ta.aic <- AIC(dic.ta)

g1 <- ggplot(norway)+geom_point(aes(x=DIC,y=DIC_pred))+
  geom_abline(slope= 1, intercept = 0)+coord_trans(x="log10",y="log10")+
  annotate(x = 0.5e-3, y = 3e-3, geom = "label", label = paste("R2 = ",dic.ta.r2, ""))+
    theme_minimal(base_size = 15)

g2 <- ggplot(norway)+geom_point(aes(x=DIC_pred,y=DIC_pred_res))+
   coord_trans(x = "log10")+labs(x = "fitted", y = "Std Residuals")+
  theme_minimal(base_size = 15)

norway$pCO2_calc <- with(norway, DIC_pred/(K0*(1+K1*Hplus+K2*K1*Hplus^2)))

g3 <- ggplot(norway,aes(x= pH))+geom_point(aes(y = pCO2_calc, col = "pCO2 from model"))+
  geom_point(aes(y = fCO2, col = "Measured pCO2"))+
  labs(y = "pCO2, atm", x = "pH", col = "")+
  scale_color_manual(values = c( "firebrick","dodgerblue3"))+
  facet_grid(cols = vars(survey))+
  theme_minimal(base_size = 15)+
  theme(legend.position = "bottom")

plot_grid(ggarrange(g1,g2),g3,nrow = 2,rel_heights = c(2,3))

Figure 9: Predicted vs fitted values, Standardized residuals vs fitted values,and measured (red) calculated (blue) pCO2 vs pH for DIC~TA

nlss$dic_pred <- predict(dic.ta, newdata = nlss, type = "response")
nlss$pCO2_calc <- with(nlss, dic_pred/(K0+K0*K1/Hplus+K0*K1*K2/Hplus))

ggplot(data = nlss, aes(x = pH))+
  geom_point(aes(y = pCO2_calc, col = "pCO2 from modelled DIC"), alpha = 0.5)+
  geom_hline(yintercept = fCO2.atm.1995, lty = 2)+
  labs(x = "pH", col = "", y = "pCO2, atm")+
  scale_color_manual(values = c("dodgerblue"))+
  theme_bw(base_size = 15)+
  theme(legend.position = "top")

Figure 10: Predicted pCO2 in atm vs pH for the Northern Lake Survey

pco2.ta <- glm(formula = fCO2~TA+0+offset(fCO2.atm), data = norway, family = Gamma(link = "identity"))

pco2.ta.sum <- summary(pco2.ta)$coefficients %>% as.data.frame() 
pco2.ta.sp <- rownames(pco2.ta.sum)[which(pco2.ta.sum$`Pr(>|t|)` < 0.05)] %>% paste(collapse = ", ")
pco2.ta.sum <- pco2.ta.sum %>% format(scientific = T, digits = 2)

kable(pco2.ta.sum, digits = 2) %>% kable_styling(bootstrap_options = "bordered")
Estimate Std. Error t value Pr(>|t|)
TA 6.3e+00 9.8e-01 6.5e+00 1e-09
norway$pCO2_pred <- fitted(pco2.ta)
norway$pCO2_pred_res <- residuals(pco2.ta)
pco2.ta.r2 <- with(summary(pco2.ta), 1 - deviance/null.deviance) %>% round(2)
pco2.ta.aic <- AIC(pco2.ta)

g1 <- ggplot(norway)+geom_point(aes(x=fCO2,y=pCO2_pred))+   
  labs(x = "Measured pCO2, atm", y = "Predicted pCO2, atm")+
  geom_abline(slope= 1, intercept = 0)+coord_trans(x="log10",y="log10")+
  annotate(x = 0.5e-3, y = 3e-3, geom = "label", label = paste("R2 = ",pco2.ta.r2, ""))+
    theme_minimal()

g2 <- ggplot(norway)+geom_point(aes(x=pCO2_pred,y=pCO2_pred_res))+
     coord_trans(x = "log10")+labs(x = "fitted", y = "Std Residuals")+
    theme_minimal()

g3 <- ggplot(norway,aes(x= pH))+geom_point(aes(y = pCO2_pred, col = "pCO2 from model"))+
  geom_point(aes(y = fCO2, col = "Measured pCO2"))+
  labs(y = "pCO2, atm", x = "pH", col = "")+
  scale_color_manual(values = c( "firebrick","dodgerblue3"))+
  facet_grid(cols = vars(survey))+
  theme_minimal()+
  theme(legend.position = "bottom")

plot_grid(ggarrange(g1,g2),g3,nrow = 2,rel_heights = c(2,3))

Figure 11: Predicted vs fitted values, Standardized residuals vs fitted values,and measured (red) predicted (blue) pCO2 vs pH for pCO2~TA

nlss$pCO2_pred <- predict(pco2.ta, newdata = nlss, type = "response")

ggplot(data = nlss, aes(x = pH))+
  geom_point(aes(y = pCO2_pred, col = "predicted pCO2 NLS"), alpha = 0.5)+
  geom_hline(yintercept = fCO2.atm.1995, lty = 2)+
  labs(x = "pH", col = "", y = "pCO2, atm")+
  scale_color_manual(values = c("dodgerblue"))+
  theme_bw(base_size = 15)+
  theme(legend.position = "top")

Figure 12: Predicted pCO2 in atm vs pH for the Northern Lake Survey

3.2 1/TA

norway$invTA <- with(norway, ifelse(TA == 0, 0, 1/TA))

dic.invTA <- glm(formula = DIC~invTA, data = norway, family = Gamma(link = "identity"))

dic.invTA.sum <- summary(dic.invTA)$coefficients %>% as.data.frame() 
dic.invTA.sp <- rownames(dic.invTA.sum)[which(dic.invTA.sum$`Pr(>|t|)` < 0.05)] %>% paste(collapse = ", ")
dic.invTA.sum <- dic.invTA.sum %>% format(scientific = T, digits = 2)

kable(dic.invTA.sum, digits = 2) %>% kable_styling(bootstrap_options = "bordered")
Estimate Std. Error t value Pr(>|t|)
(Intercept) 2.1e-04 2.8e-05 7.7e+00 1.2e-12
invTA -3.5e-10 8.1e-11 -4.3e+00 2.9e-05
norway$DIC_pred <- fitted(dic.invTA)
norway$DIC_pred_res <- residuals(dic.invTA)
dic.invTA.r2 <- with(summary(dic.invTA), 1 - deviance/null.deviance) %>% round(2)
dic.invTA.aic <- AIC(dic.invTA)

g1 <- ggplot(norway)+geom_point(aes(x=DIC,y=DIC_pred))+
  geom_abline(slope= 1, intercept = 0)+coord_trans(x="log10",y="log10")+
  annotate(x = 0.5e-3, y = 3e-3, geom = "label", label = paste("R2 = ",dic.invTA.r2, ""))+
    theme_minimal()

g2 <- ggplot(norway)+geom_point(aes(x=DIC_pred,y=DIC_pred_res))+
   coord_trans(x = "log10")+labs(x = "fitted", y = "Std Residuals")+
  theme_minimal()

norway$pCO2_calc <- with(norway, DIC_pred/(K0*(1+K1*Hplus+K2*K1*Hplus^2)))


g3 <- ggplot(norway,aes(x= pH))+geom_point(aes(y = pCO2_calc, col = "pCO2 from model"))+
  geom_point(aes(y = fCO2, col = "Measured pCO2"))+
  labs(y = "pCO2, atm", x = "pH", col = "")+
  scale_color_manual(values = c( "firebrick","dodgerblue3"))+
  facet_grid(cols = vars(survey))+
  theme_minimal()+
  theme(legend.position = "bottom")

plot_grid(ggarrange(g1,g2),g3,nrow = 2, rel_heights = c(2,3))

Figure 13: Predicted vs fitted values, Standardized residuals vs fitted values,and measured (red) calculated (blue) pCO2 vs pH for DIC~1/TA

nlss$invTA <- with(nlss, ifelse(TA == 0, 0, 1/TA))

nlss$dic_pred <- predict(dic.invTA, newdata = nlss, type = "response")
nlss$pCO2_calc <- with(nlss, dic_pred/(K0+K0*K1/Hplus+K0*K1*K2/Hplus))

ggplot(data = nlss, aes(x = pH))+
  geom_point(aes(y = pCO2_calc, col = "pCO2 from modelled DIC"), alpha = 0.5)+
  geom_hline(yintercept = fCO2.atm.1995, lty = 2)+
  labs(x = "pH", col = "", y = "pCO2, atm")+
  scale_color_manual(values = c("dodgerblue"))+
  theme_bw(base_size = 15)+
  theme(legend.position = "top")

Figure 14: Predicted pCO2 in atm vs pH for the Northern Lake Survey

pco2.invTA <- glm(formula = fCO2~invTA+0+offset(fCO2.atm), data = norway, family = Gamma(link = "identity"))

pco2.invTA.sum <- summary(pco2.invTA)$coefficients %>% as.data.frame() 
pco2.invTA.sp <- rownames(pco2.invTA.sum)[which(pco2.invTA.sum$`Pr(>|t|)` < 0.05)] %>% paste(collapse = ", ")
pco2.invTA.sum <- pco2.invTA.sum %>% format(scientific = T, digits = 2)

kable(pco2.invTA.sum, digits = 2) %>% kable_styling(bootstrap_options = "bordered")
Estimate Std. Error t value Pr(>|t|)
invTA 2.3e-08 7.4e-09 3.1e+00 2.6e-03
norway$pCO2_pred <- fitted(pco2.invTA)
norway$pCO2_pred_res <- residuals(pco2.invTA)
pco2.invTA.r2 <- with(summary(pco2.invTA), 1 - deviance/null.deviance) %>% round(2)
pco2.invTA.aic <- AIC(pco2.invTA)

g1 <- ggplot(norway)+geom_point(aes(x=fCO2,y=pCO2_pred))+   
  labs(x = "Measured pCO2, atm", y = "Predicted pCO2, atm")+
  geom_abline(slope= 1, intercept = 0)+coord_trans(x="log10",y="log10")+
  annotate(x = 0.5e-3, y = 3e-3, geom = "label", label = paste("R2 = ",pco2.invTA.r2, ""))+
    theme_minimal()

g2 <- ggplot(norway)+geom_point(aes(x=pCO2_pred,y=pCO2_pred_res))+
     coord_trans(x = "log10")+labs(x = "fitted", y = "Std Residuals")+
    theme_minimal()

g3 <- ggplot(norway,aes(x= pH))+geom_point(aes(y = pCO2_pred, col = "pCO2 from model"))+
  geom_point(aes(y = fCO2, col = "Measured pCO2"))+
  labs(y = "pCO2, atm", x = "pH", col = "")+
  scale_color_manual(values = c( "firebrick","dodgerblue3"))+
  facet_grid(cols = vars(survey))+
  theme_minimal()+
  theme(legend.position = "bottom")

plot_grid(ggarrange(g1,g2),g3,nrow = 2,rel_heights = c(2,3))

Figure 15: Predicted vs fitted values, Standardized residuals vs fitted values,and measured (red) and predicted (blue) pCO2 vs pH for pCO2~1/TA

nlss$pCO2_pred <- predict(pco2.invTA, newdata = nlss, type = "response")

ggplot(data = nlss, aes(x = pH))+
  geom_point(aes(y = pCO2_pred, col = "predicted pCO2 NLS"), alpha = 0.5)+
  geom_hline(yintercept = fCO2.atm.1995, lty = 2)+
  labs(x = "pH", col = "", y = "pCO2, atm")+
  scale_color_manual(values = c("dodgerblue"))+
  theme_bw(base_size = 15)+
  theme(legend.position = "top")

Figure 16: Predicted pCO2 in atm vs pH for the Northern Lake Survey

3.3 TOC

dic.toc <- glm(formula = DIC~TOC, data = norway, family = Gamma(link = "identity"))
dic.toc.sum <- summary(dic.toc)$coefficients %>% as.data.frame() 
dic.toc.sp <- rownames(dic.toc.sum)[which(dic.toc.sum$`Pr(>|t|)` < 0.05)] %>% paste(collapse = ", ")
dic.toc.sum <- dic.toc.sum %>% format(scientific = T, digits = 2)

kable(dic.toc.sum, digits = 2) %>% kable_styling(bootstrap_options = "bordered")
Estimate Std. Error t value Pr(>|t|)
(Intercept) 2.0e-05 7.8e-06 2.6e+00 1.1e-02
TOC 2.8e-01 3.7e-02 7.6e+00 1.8e-12
norway$DIC_pred <- fitted(dic.toc)
norway$DIC_pred_res <- residuals(dic.toc)
dic.toc.r2 <- with(summary(dic.toc), 1 - deviance/null.deviance) %>% round(2)
dic.toc.aic <- AIC(dic.toc)

g1 <- ggplot(norway)+geom_point(aes(x=DIC,y=DIC_pred))+
  geom_abline(slope= 1, intercept = 0)+coord_trans(x="log10",y="log10")+
  annotate(x = 0.5e-3, y = 3e-3, geom = "label", label = paste("R2 = ",dic.toc.r2, ""))+
    theme_minimal()

g2 <- ggplot(norway)+geom_point(aes(x=DIC_pred,y=DIC_pred_res))+
     coord_trans(x = "log10")+labs(x = "fitted", y = "Std Residuals")+
    theme_minimal()

norway$pCO2_calc <- with(norway, DIC_pred/(K0+K0*K1/Hplus+K0*K1*K2/(Hplus^2)))


g3 <- ggplot(norway,aes(x= pH))+geom_point(aes(y = pCO2_calc, col = "pCO2 from model"))+
  geom_point(aes(y = fCO2, col = "Measured pCO2"))+
  labs(y = "pCO2, atm", x = "pH", col = "")+
  scale_color_manual(values = c( "firebrick","dodgerblue3"))+
  facet_grid(cols = vars(survey))+
  theme_minimal()+
  theme(legend.position = "bottom")

plot_grid(ggarrange(g1,g2),g3,nrow = 2,rel_heights = c(2,3))

Figure 17: Predicted vs fitted values, Standardized residuals vs fitted values, and measured (red) calculated (blue) pCO2 vs pH for DIC~TOC

nlss$dic_pred <- predict(dic.toc, newdata = nlss, type = "response")
nlss$pCO2_calc <- with(nlss, dic_pred/(K0+K0*K1/Hplus+K0*K1*K2/Hplus))

ggplot(data = nlss, aes(x = pH))+
  geom_point(aes(y = pCO2_calc, col = "pCO2 from predicted DIC, NLS"), alpha = 0.5)+
  geom_hline(yintercept = fCO2.atm.1995, lty = 2)+
  labs(x = "pH", col = "", y = "pCO2, atm")+
  scale_color_manual(values = c("dodgerblue"))+
  theme_bw(base_size = 15)+
  theme(legend.position = "top")

Figure 18: Predicted pCO2 in atm vs pH for the Northern Lake Survey

pco2.toc <- glm(formula = fCO2~TOC+0+offset(fCO2.atm), data = norway, family = Gamma(link = "identity"))

pco2.toc.sum <- summary(pco2.toc)$coefficients %>% as.data.frame() 
pco2.toc.sp <- rownames(pco2.toc.sum)[which(pco2.toc.sum$`Pr(>|t|)` < 0.05)] %>% paste(collapse = ", ")
pco2.toc.sum <- pco2.toc.sum %>% format(scientific = T, digits = 2)

kable(pco2.toc.sum, digits = 2) %>% kable_styling(bootstrap_options = "bordered")
Estimate Std. Error t value Pr(>|t|)
TOC 1.1e+00 6.4e-02 1.6e+01 5e-37
norway$pCO2_pred <- fitted(pco2.toc)
norway$pCO2_pred_res <- residuals(pco2.toc)
pco2.toc.r2 <- with(summary(pco2.toc), 1 - deviance/null.deviance) %>% round(2)
pco2.toc.aic <- AIC(pco2.toc)

g1 <- ggplot(norway)+geom_point(aes(x=fCO2,y=pCO2_pred))+   labs(x = "Measured pCO2, atm", y = "Predicted pCO2, atm")+
  geom_abline(slope= 1, intercept = 0)+coord_trans(x="log10",y="log10")+
  annotate(x = 0.5e-3, y = 3e-3, geom = "label", label = paste("R2 = ",pco2.toc.r2, ""))+
    theme_minimal()

g2 <- ggplot(norway)+geom_point(aes(x=pCO2_pred,y=pCO2_pred_res))+
     coord_trans(x = "log10")+labs(x = "fitted", y = "Std Residuals")+
    theme_minimal()

g3 <- ggplot(norway,aes(x= pH))+geom_point(aes(y = pCO2_pred, col = "pCO2 from model"))+
  geom_point(aes(y = fCO2, col = "Measured pCO2"))+
  labs(y = "pCO2, atm", x = "pH", col = "")+
  scale_color_manual(values = c( "firebrick","dodgerblue3"))+
  facet_grid(cols = vars(survey))+
  theme_minimal()+
  theme(legend.position = "bottom")

plot_grid(ggarrange(g1,g2),g3,nrow = 2,rel_heights = c(2,3))

Figure 19: Predicted vs fitted values, Standardized residuals vs fitted values,and measured (red) and predicted (blue), pCO2 vs pH for DIC~TOC

nlss$pCO2_pred <- predict(pco2.toc, newdata = nlss, type = "response")

ggplot(data = nlss, aes(x = pH))+
  geom_point(aes(y = pCO2_pred, col = "Predicted pCO2"), alpha = 0.5)+
  geom_hline(yintercept = fCO2.atm.1995, lty = 2)+
  labs(x = "pH", col = "", y = "pCO2, atm")+
  scale_color_manual(values = c("dodgerblue"))+
  theme_bw(base_size = 15)+
  theme(legend.position = "top")

Figure 20: Predicted pCO2 in atm vs pH for the Northern Lake Survey

ggplot(norway,aes(x= pH))+geom_point(aes(y = pCO2_pred, col = "pCO2 from model"), alpha = 0.5)+
  geom_point(aes(y = fCO2, col = "Measured pCO2"), alpha = 0.5)+
  labs(y = "pCO2, atm", x = "pH", col = "")+
  scale_color_manual(values = c( "firebrick","dodgerblue3"))+
  facet_grid(cols = vars(survey))+
  theme_bw(base_size = 15)+
  theme(legend.position = "bottom")

Figure 21: Model results for pCO2~TOC

3.4 TOC + temp

dic.toc.temp <- glm(formula = DIC~TOC+temp_k, data = norway, family = Gamma(link = "identity"))
dic.toc.temp.sum <- summary(dic.toc.temp)$coefficients %>% as.data.frame() 
dic.toc.temp.sp <- rownames(dic.toc.temp.sum)[which(dic.toc.temp.sum$`Pr(>|t|)` < 0.05)] %>% paste(collapse = ", ")
dic.toc.temp.sum <- dic.toc.temp.sum %>% format(scientific = T, digits = 2)

kable(dic.toc.temp.sum, digits = 2) %>% kable_styling(bootstrap_options = "bordered")
Estimate Std. Error t value Pr(>|t|)
(Intercept) 1.4e-03 8.4e-04 1.6e+00 1.1e-01
TOC 3.0e-01 3.9e-02 7.6e+00 1.7e-12
temp_k -4.8e-06 3.0e-06 -1.6e+00 1.1e-01
norway$DIC_pred <- fitted(dic.toc.temp)
norway$DIC_pred_res <- residuals(dic.toc.temp)
dic.toc.temp.r2 <- with(summary(dic.toc.temp), 1 - deviance/null.deviance) %>% round(2)
dic.toc.temp.aic <- AIC(dic.toc.temp)

g1 <- ggplot(norway)+geom_point(aes(x=DIC,y=DIC_pred))+
  geom_abline(slope= 1, intercept = 0)+coord_trans(x="log10",y="log10")+
  annotate(x = 0.5e-3, y = 3e-3, geom = "label", label = paste("R2 = ",dic.toc.temp.r2, ""))+
    theme_minimal()

g2 <- ggplot(norway)+geom_point(aes(x=DIC_pred,y=DIC_pred_res))+
     coord_trans(x = "log10")+labs(x = "fitted", y = "Std Residuals")+
    theme_minimal()

norway$pCO2_calc <- with(norway, DIC_pred/(K0+K0*K1/Hplus+K0*K1*K2/(Hplus^2)))


g3 <- ggplot(norway,aes(x= pH))+geom_point(aes(y = pCO2_calc, col = "pCO2 from model"))+
  geom_point(aes(y = fCO2, col = "Measured pCO2"))+
  labs(y = "pCO2, atm", x = "pH", col = "")+
  scale_color_manual(values = c( "firebrick","dodgerblue3"))+
  facet_grid(cols = vars(survey))+
  theme_minimal()+
  theme(legend.position = "bottom")

plot_grid(ggarrange(g1,g2),g3,nrow = 2,rel_heights = c(2,3))

Figure 17: Predicted vs fitted values, Standardized residuals vs fitted values, and measured (red) calculated (blue) pCO2 vs pH for DIC~TOC

nlss$dic_pred <- predict(dic.toc.temp, newdata = nlss, type = "response")
nlss$pCO2_calc <- with(nlss, dic_pred/(K0+K0*K1/Hplus+K0*K1*K2/Hplus))

ggplot(data = nlss, aes(x = pH))+
  geom_point(aes(y = pCO2_calc, col = "pCO2 from predicted DIC, NLS"), alpha = 0.5)+
  geom_hline(yintercept = fCO2.atm.1995, lty = 2)+
  labs(x = "pH", col = "", y = "pCO2, atm")+
  scale_color_manual(values = c("dodgerblue"))+
  theme_bw(base_size = 15)+
  theme(legend.position = "top")

Figure 18: Predicted pCO2 in atm vs pH for the Northern Lake Survey

pco2.toc.temp <- glm(formula = fCO2~TOC+temp_k+0+offset(fCO2.atm), data = norway, family = Gamma(link = "identity"))

pco2.toc.temp.sum <- summary(pco2.toc.temp)$coefficients %>% as.data.frame() 
pco2.toc.temp.sp <- rownames(pco2.toc.temp.sum)[which(pco2.toc.temp.sum$`Pr(>|t|)` < 0.05)] %>% paste(collapse = ", ")
pco2.toc.temp.sum <- pco2.toc.temp.sum %>% format(scientific = T, digits = 2)

kable(pco2.toc.temp.sum, digits = 2) %>% kable_styling(bootstrap_options = "bordered")
Estimate Std. Error t value Pr(>|t|)
TOC 9.4e-01 8.3e-02 1.1e+01 1.3e-22
temp_k 2.4e-07 1.3e-07 1.9e+00 6.4e-02
norway$pCO2_pred <- fitted(pco2.toc.temp)
norway$pCO2_pred_res <- residuals(pco2.toc.temp)
pco2.toc.temp.r2 <- with(summary(pco2.toc.temp), 1 - deviance/null.deviance) %>% round(2)
pco2.toc.temp.aic <- AIC(pco2.toc.temp)

g1 <- ggplot(norway)+geom_point(aes(x=fCO2,y=pCO2_pred))+   labs(x = "Measured pCO2, atm", y = "Predicted pCO2, atm")+
  geom_abline(slope= 1, intercept = 0)+coord_trans(x="log10",y="log10")+
  annotate(x = 0.5e-3, y = 3e-3, geom = "label", label = paste("R2 = ",pco2.toc.temp.r2, ""))+
    theme_minimal()

g2 <- ggplot(norway)+geom_point(aes(x=pCO2_pred,y=pCO2_pred_res))+
     coord_trans(x = "log10")+labs(x = "fitted", y = "Std Residuals")+
    theme_minimal()

g3 <- ggplot(norway,aes(x= pH))+geom_point(aes(y = pCO2_pred, col = "pCO2 from model"))+
  geom_point(aes(y = fCO2, col = "Measured pCO2"))+
  labs(y = "pCO2, atm", x = "pH", col = "")+
  scale_color_manual(values = c( "firebrick","dodgerblue3"))+
  facet_grid(cols = vars(survey))+
  theme_minimal()+
  theme(legend.position = "bottom")

plot_grid(ggarrange(g1,g2),g3,nrow = 2,rel_heights = c(2,3))

Figure 19: Predicted vs fitted values, Standardized residuals vs fitted values,and measured (red) and predicted (blue), pCO2 vs pH for DIC~TOC

nlss$pCO2_pred <- predict(pco2.toc.temp, newdata = nlss, type = "response")

ggplot(data = nlss, aes(x = pH))+
  geom_point(aes(y = pCO2_pred, col = "Predicted pCO2"), alpha = 0.5)+
  geom_hline(yintercept = fCO2.atm.1995, lty = 2)+
  labs(x = "pH", col = "", y = "pCO2, atm")+
  scale_color_manual(values = c("dodgerblue"))+
  theme_bw(base_size = 15)+
  theme(legend.position = "top")

Figure 20: Predicted pCO2 in atm vs pH for the Northern Lake Survey

ggplot(norway,aes(x= pH))+geom_point(aes(y = pCO2_pred, col = "pCO2 from model"), alpha = 0.5)+
  geom_point(aes(y = fCO2, col = "Measured pCO2"), alpha = 0.5)+
  labs(y = "pCO2, atm", x = "pH", col = "")+
  scale_color_manual(values = c( "firebrick","dodgerblue3"))+
  facet_grid(cols = vars(survey))+
  theme_bw(base_size = 15)+
  theme(legend.position = "bottom")

Figure 21: Model results for pCO2~TOC

3.5 TA + TOC

dic.toc.ta <- glm(formula = DIC~TOC+TA, data = norway, family = Gamma(link = "identity"))
dic.toc.ta.sum <- summary(dic.toc.ta)$coefficients %>% as.data.frame() 
dic.toc.ta.sp <- rownames(dic.toc.ta.sum)[which(dic.toc.ta.sum$`Pr(>|t|)` < 0.05)] %>% paste(collapse = ", ")
dic.toc.ta.sum <- dic.toc.ta.sum %>% format(scientific = T, digits = 2)

kable(dic.toc.ta.sum, digits = 2) %>% kable_styling(bootstrap_options = "bordered")
Estimate Std. Error t value Pr(>|t|)
(Intercept) 3.0e-05 3.1e-06 9.6e+00 1.3e-17
TOC 3.2e-02 8.9e-03 3.6e+00 3.9e-04
TA 1.2e+00 9.6e-02 1.2e+01 5.9e-25
norway$DIC_pred <- fitted(dic.toc.ta)
norway$DIC_pred_res <- residuals(dic.toc.ta)
dic.toc.ta.r2 <- with(summary(dic.toc.ta), 1 - deviance/null.deviance) %>% round(2)
dic.toc.ta.aic <- AIC(dic.toc.ta)

g1 <- ggplot(norway)+geom_point(aes(x=DIC,y=DIC_pred))+
  geom_abline(slope= 1, intercept = 0)+coord_trans(x="log10",y="log10")+
  annotate(x = 0.5e-3, y = 3e-3, geom = "label", label = paste("R2 = ",dic.toc.ta.r2, ""))+
    theme_minimal()

g2 <- ggplot(norway)+geom_point(aes(x=DIC_pred,y=DIC_pred_res))+
     coord_trans(x = "log10")+labs(x = "fitted", y = "Std Residuals")+
    theme_minimal()

norway$pCO2_calc <- with(norway, DIC_pred/(K0+K0*K1/Hplus+K0*K1*K2/(Hplus^2)))


g3 <- ggplot(norway,aes(x= pH))+geom_point(aes(y = pCO2_calc, col = "pCO2 from model"))+
  geom_point(aes(y = fCO2, col = "Measured pCO2"))+
  labs(y = "pCO2, atm", x = "pH", col = "")+
  scale_color_manual(values = c( "firebrick","dodgerblue3"))+
  facet_grid(cols = vars(survey))+
  theme_minimal()+
  theme(legend.position = "bottom")

plot_grid(ggarrange(g1,g2),g3,nrow = 2,rel_heights = c(2,3))

Figure 22: Predicted vs fitted values, Standardized residuals vs fitted values,and measured (red) calculated (blue) pCO2 vs pH for DIC~TA+TOC

nlss$dic_pred <- predict(dic.toc.ta, newdata = nlss, type = "response")
nlss$pCO2_calc <- with(nlss, dic_pred/(K0+K0*K1/Hplus+K0*K1*K2/Hplus))

ggplot(data = nlss, aes(x = pH))+
  geom_point(aes(y = pCO2_calc, col = "pCO2 from predicted DIC, NLS"), alpha = 0.5)+
  geom_hline(yintercept = fCO2.atm.1995, lty = 2)+
  labs(x = "pH", col = "", y = "pCO2, atm")+
  scale_color_manual(values = c("dodgerblue"))+
  theme_bw(base_size = 15)+
  theme(legend.position = "top")

Figure 23: Predicted pCO2 in atm vs pH for the Northern Lake Survey

pco2.toc.ta <- glm(formula = fCO2~TOC+TA+0+offset(fCO2.atm), data = norway, family = Gamma(link = "identity"))

pco2.toc.ta.sum <- summary(pco2.toc.ta)$coefficients %>% as.data.frame() 
pco2.toc.ta.sp <- rownames(pco2.toc.ta.sum)[which(pco2.toc.ta.sum$`Pr(>|t|)` < 0.05)] %>% paste(collapse = ", ")
pco2.toc.ta.sum <- pco2.toc.ta.sum %>% format(scientific = T, digits = 2)

kable(pco2.toc.ta.sum, digits = 2) %>% kable_styling(bootstrap_options = "bordered")
Estimate Std. Error t value Pr(>|t|)
TOC 9.0e-01 7.9e-02 1.1e+01 9.3e-23
TA 8.3e-01 3.7e-01 2.3e+00 2.4e-02
norway$pCO2_pred <- fitted(pco2.toc.ta)
norway$pCO2_pred_res <- residuals(pco2.toc.ta)
pco2.toc.ta.r2 <- with(summary(pco2.toc.ta), 1 - deviance/null.deviance) %>% round(2)
pco2.toc.ta.aic <- AIC(pco2.toc.ta)

g1 <- ggplot(norway)+geom_point(aes(x=fCO2,y=pCO2_pred))+   labs(x = "Measured pCO2, atm", y = "Predicted pCO2, atm")+
  geom_abline(slope= 1, intercept = 0)+coord_trans(x="log10",y="log10")+
  annotate(x = 0.5e-3, y = 3e-3, geom = "label", label = paste("R2 = ",pco2.toc.ta.r2, ""))+
    theme_minimal()

g2 <- ggplot(norway)+geom_point(aes(x=pCO2_pred,y=pCO2_pred_res))+
     coord_trans(x = "log10")+labs(x = "fitted", y = "Std Residuals")+
    theme_minimal()

g3 <- ggplot(norway,aes(x= pH))+geom_point(aes(y = pCO2_pred, col = "pCO2 from model"))+
  geom_point(aes(y = fCO2, col = "Measured pCO2"))+
  labs(y = "pCO2, atm", x = "pH", col = "")+
  scale_color_manual(values = c( "firebrick","dodgerblue3"))+
  facet_grid(cols = vars(survey))+
  theme_minimal()+
  theme(legend.position = "bottom")

plot_grid(ggarrange(g1,g2),g3,nrow = 2,rel_heights = c(2,3))

Figure 24: Predicted vs fitted values, Standardized residuals vs fitted values,and measured (red) calculated (blue) pCO2 vs pH for pCO2~TOC+TA `

nlss$pCO2_pred <- predict(pco2.toc.ta, newdata = nlss, type = "response")

ggplot(data = nlss, aes(x = pH))+
  geom_point(aes(y = pCO2_pred, col = "Predicted pCO2, NLS"), alpha = 0.5)+
  geom_hline(yintercept = fCO2.atm.1995, lty = 2)+
  labs(x = "pH", col = "", y = "pCO2, atm")+
  scale_color_manual(values = c("dodgerblue"))+
  theme_bw(base_size = 15)+
  theme(legend.position = "top")

Figure 25: Predicted pCO2 in atm vs pH for the Northern Lake Survey

3.6 TA*TOC + Hplus

dic.tocta.Hplus <- glm(formula = DIC~TOC+TA+Hplus, data = norway, family = Gamma(link = "identity"))
  dic.tocta.Hplus.sum <- summary(dic.tocta.Hplus)$coefficients %>% as.data.frame() 
  dic.tocta.Hplus.sp <- rownames(dic.tocta.Hplus.sum)[which(dic.tocta.Hplus.sum$`Pr(>|t|)` < 0.05)] %>% paste(collapse = ", ")
  dic.tocta.Hplus.sum <- dic.tocta.Hplus.sum %>% format(scientific = T, digits = 2)

kable(dic.tocta.Hplus.sum, digits = 2) %>% kable_styling(bootstrap_options = "bordered")
Estimate Std. Error t value Pr(>|t|)
(Intercept) 1.9e-05 3.8e-06 5.1e+00 8.0e-07
TOC 5.2e-03 7.5e-03 6.9e-01 4.9e-01
TA 1.4e+00 1.1e-01 1.3e+01 9.2e-28
Hplus 3.7e+00 8.4e-01 4.3e+00 2.4e-05
norway$DIC_pred <- fitted(dic.tocta.Hplus)
norway$DIC_pred_res <- residuals(dic.tocta.Hplus)
dic.tocta.Hplus.r2 <- with(summary(dic.tocta.Hplus), 1 - deviance/null.deviance) %>% round(2)
dic.tocta.Hplus.aic <- AIC(dic.tocta.Hplus)

g1 <- ggplot(norway)+geom_point(aes(x=DIC,y=DIC_pred))+
  geom_abline(slope= 1, intercept = 0)+coord_trans(x="log10",y="log10")+
  annotate(x = 0.5e-3, y = 3e-3, geom = "label", label = paste("R2 = ",dic.tocta.Hplus.r2, ""))+
    theme_minimal()

g2 <- ggplot(norway)+geom_point(aes(x=DIC_pred,y=DIC_pred_res))+
     coord_trans(x = "log10")+labs(x = "fitted", y = "Std Residuals")+
    theme_minimal()

norway$pCO2_calc <- with(norway, DIC_pred/(K0+K0*K1/Hplus+K0*K1*K2/(Hplus^2)))


g3 <- ggplot(norway,aes(x= pH))+geom_point(aes(y = pCO2_calc, col = "pCO2 from model"))+
  geom_point(aes(y = fCO2, col = "Measured pCO2"))+
  labs(y = "pCO2, atm", x = "pH", col = "")+
  scale_color_manual(values = c( "firebrick","dodgerblue3"))+
  facet_grid(cols = vars(survey))+
  theme_minimal()+
  theme(legend.position = "bottom")

plot_grid(ggarrange(g1,g2),g3,nrow = 2,rel_heights = c(2,3))

Figure 26: Predicted vs fitted values, Standardized residuals vs fitted values,and measured (red) calculated (blue) pCO2 vs pH for DIC~TA x TOC + Hplus

nlss$dic_pred <- predict(dic.tocta.Hplus, newdata = nlss, type = "response")
nlss$pCO2_calc <- with(nlss, dic_pred/(K0+K0*K1/Hplus+K0*K1*K2/Hplus))

ggplot(data = nlss, aes(x = pH))+
  geom_point(aes(y = pCO2_calc, col = "pCO2 from predicted DIC, NLS"), alpha = 0.5)+
  geom_hline(yintercept = fCO2.atm.1995, lty = 2)+
  labs(x = "pH", col = "", y = "pCO2, atm")+
  scale_color_manual(values = c("dodgerblue"))+
  theme_bw(base_size = 15)+
  theme(legend.position = "top")

Figure 27: Predicted pCO2 in atm vs pH for the Northern Lake Survey

pco2.tocta.Hplus <- glm(formula = fCO2~TOC+TA+Hplus+0+offset(fCO2.atm), data = norway, family = Gamma(link = "identity"))

pco2.tocta.Hplus.sum <- summary(pco2.tocta.Hplus)$coefficients %>% as.data.frame() 
pco2.tocta.Hplus.sp <- rownames(pco2.tocta.Hplus.sum)[which(pco2.tocta.Hplus.sum$`Pr(>|t|)` < 0.05)] %>% paste(collapse = ", ")
pco2.tocta.Hplus.sum <- pco2.tocta.Hplus.sum %>% format(scientific = T, digits = 2)

kable(pco2.tocta.Hplus.sum, digits = 2) %>% kable_styling(bootstrap_options = "bordered")
Estimate Std. Error t value Pr(>|t|)
TOC 6.7e-01 8.7e-02 7.7e+00 8.8e-13
TA 1.5e+00 4.2e-01 3.6e+00 4.0e-04
Hplus 2.9e+01 8.5e+00 3.4e+00 7.6e-04
norway$pCO2_pred <- fitted(pco2.tocta.Hplus)
norway$pCO2_pred_res <- residuals(pco2.tocta.Hplus)
pco2.tocta.Hplus.r2 <- with(summary(pco2.tocta.Hplus), 1 - deviance/null.deviance) %>% round(2)
pco2.tocta.Hplus.aic <- AIC(pco2.tocta.Hplus)

g1 <- ggplot(norway)+geom_point(aes(x=fCO2,y=pCO2_pred))+   labs(x = "Measured pCO2, atm", y = "Predicted pCO2, atm")+
  geom_abline(slope= 1, intercept = 0)+coord_trans(x="log10",y="log10")+
  annotate(x = 0.5e-3, y = 3e-3, geom = "label", label = paste("R2 = ",pco2.tocta.Hplus.r2, ""))+
    theme_minimal()

g2 <- ggplot(norway)+geom_point(aes(x=pCO2_pred,y=pCO2_pred_res))+
     coord_trans(x = "log10")+labs(x = "fitted", y = "Std Residuals")+
    theme_minimal()

g3 <- ggplot(norway,aes(x= pH))+geom_point(aes(y = pCO2_pred, col = "pCO2 from model"))+
  geom_point(aes(y = fCO2, col = "Measured pCO2"))+
  labs(y = "pCO2, atm", x = "pH", col = "")+
  scale_color_manual(values = c( "firebrick","dodgerblue3"))+
  facet_grid(cols = vars(survey))+
  theme_minimal()+
  theme(legend.position = "bottom")

plot_grid(ggarrange(g1,g2),g3,nrow = 2,rel_heights = c(2,3))

Figure 28: Predicted vs fitted values, Standardized residuals vs fitted values,and measured (red) calculated (blue) pCO2 vs pH for pCO2~TA x TOC + Hplus

nlss$pCO2_pred <- predict(pco2.tocta.Hplus, newdata = nlss, type = "response")

ggplot(data = nlss, aes(x = pH))+
  geom_point(aes(y = pCO2_pred, col = "Predicted pCO2, NLS"), alpha = 0.5)+
  geom_hline(yintercept = fCO2.atm.1995, lty = 2)+
  labs(x = "pH", col = "", y = "pCO2, atm")+
  scale_color_manual(values = c("dodgerblue"))+
  theme_bw(base_size = 15)+
  theme(legend.position = "top")

Figure 29: Predicted pCO2 in atm vs pH for the Northern Lake Survey

3.7 TOC + pH

dic.toc.Hplus <- glm(formula = DIC~TOC+Hplus, data = norway, family = Gamma(link = "identity"))
dic.toc.Hplus.sum <- summary(dic.toc.Hplus)$coefficients %>% as.data.frame() 
dic.toc.Hplus.sp <- rownames(dic.toc.Hplus.sum)[which(dic.toc.Hplus.sum$`Pr(>|t|)` < 0.05)] %>% paste(collapse = ", ")
dic.toc.Hplus.sum <- dic.toc.Hplus.sum %>% format(scientific = T, digits = 2)

kable(dic.toc.Hplus.sum, digits = 2) %>% kable_styling(bootstrap_options = "bordered")
Estimate Std. Error t value Pr(>|t|)
(Intercept) 3.5e-05 1.0e-05 3.4e+00 7.9e-04
TOC 2.6e-01 3.6e-02 7.5e+00 4.3e-12
Hplus -4.4e+00 1.6e+00 -2.7e+00 7.8e-03
norway$DIC_pred <- fitted(dic.toc.Hplus)
norway$DIC_pred_res <- residuals(dic.toc.Hplus)
dic.toc.Hplus.r2 <- with(summary(dic.toc.Hplus), 1 - deviance/null.deviance) %>% round(2)
dic.toc.Hplus.aic <- AIC(dic.toc.Hplus)

g1 <- ggplot(norway)+geom_point(aes(x=DIC,y=DIC_pred))+
  geom_abline(slope= 1, intercept = 0)+coord_trans(x="log10",y="log10")+
  annotate(x = 0.5e-3, y = 3e-3, geom = "label", label = paste("R2 = ",dic.toc.Hplus.r2, ""))+
    theme_minimal()

g2 <- ggplot(norway)+geom_point(aes(x=DIC_pred,y=DIC_pred_res))+
     coord_trans(x = "log10")+labs(x = "fitted", y = "Std Residuals")+
    theme_minimal()

norway$pCO2_calc <- with(norway, DIC_pred/(K0+K0*K1/Hplus+K0*K1*K2/(Hplus^2)))


g3 <- ggplot(norway,aes(x= pH))+geom_point(aes(y = pCO2_calc, col = "pCO2 from model"))+
  geom_point(aes(y = fCO2, col = "Measured pCO2"))+
  labs(y = "pCO2, atm", x = "Hplus", col = "")+
  scale_color_manual(values = c( "firebrick","dodgerblue3"))+
  facet_grid(cols = vars(survey))+
  theme_minimal()+
  theme(legend.position = "bottom")

plot_grid(ggarrange(g1,g2),g3,nrow = 2,rel_heights = c(2,3))

Figure 30: Predicted vs fitted values, Standardized residuals vs fitted values,and measured (red) calculated (blue) pCO2 vs pH for DIC~TOC+Hplus

nlss$dic_pred <- predict(dic.toc.Hplus, newdata = nlss, type = "response")
nlss$pCO2_calc <- with(nlss, dic_pred/(K0+K0*K1/Hplus+K0*K1*K2/Hplus))

ggplot(data = nlss, aes(x = pH))+
  geom_point(aes(y = pCO2_calc, col = "pCO2 from predicted DIC, NLS"), alpha = 0.5)+
  geom_hline(yintercept = fCO2.atm.1995, lty = 2)+
  labs(x = "pH", col = "", y = "pCO2, atm")+
  scale_color_manual(values = c("dodgerblue"))+
  theme_bw(base_size = 15)+
  theme(legend.position = "top")

Figure 31: Predicted pCO2 in atm vs pH for the Northern Lake Survey

pco2.toc.Hplus <- glm(formula = fCO2~TOC+Hplus+0+offset(fCO2.atm), data = norway, family = Gamma(link = "identity"))

pco2.toc.Hplus.sum <- summary(pco2.toc.Hplus)$coefficients %>% as.data.frame() 
pco2.toc.Hplus.sp <- rownames(pco2.toc.Hplus.sum)[which(pco2.toc.Hplus.sum$`Pr(>|t|)` < 0.05)] %>% paste(collapse = ", ")
pco2.toc.Hplus.sum <- pco2.toc.Hplus.sum %>% format(scientific = T, digits = 2)

kable(pco2.toc.Hplus.sum, digits = 2) %>% kable_styling(bootstrap_options = "bordered")
Estimate Std. Error t value Pr(>|t|)
TOC 9.9e-01 6.8e-02 1.5e+01 1.2e-31
Hplus 1.5e+01 8.0e+00 1.9e+00 5.4e-02
norway$pCO2_pred <- fitted(pco2.toc.Hplus)
norway$pCO2_pred_res <- residuals(pco2.toc.Hplus)
pco2.toc.Hplus.r2 <- with(summary(pco2.toc.Hplus), 1 - deviance/null.deviance) %>% round(2)
pCO2.toc.Hplus.aic <- AIC(pco2.toc.Hplus)

g1 <- ggplot(norway)+geom_point(aes(x=fCO2,y=pCO2_pred))+   labs(x = "Measured pCO2, atm", y = "Predicted pCO2, atm")+
  geom_abline(slope= 1, intercept = 0)+coord_trans(x="log10",y="log10")+
  annotate(x = 0.5e-3, y = 3e-3, geom = "label", label = paste("R2 = ",pco2.toc.Hplus.r2, ""))+
    theme_minimal()

g2 <- ggplot(norway)+geom_point(aes(x=pCO2_pred,y=pCO2_pred_res))+
     coord_trans(x = "log10")+labs(x = "fitted", y = "Std Residuals")+
    theme_minimal()

g3 <- ggplot(norway,aes(x= pH))+geom_point(aes(y = pCO2_pred, col = "pCO2 from model"))+
  geom_point(aes(y = fCO2, col = "Measured pCO2"))+
  labs(y = "pCO2, atm", x = "pH", col = "")+
  scale_color_manual(values = c( "firebrick","dodgerblue3"))+
  facet_grid(cols = vars(survey))+
  theme_minimal()+
  theme(legend.position = "bottom")

plot_grid(ggarrange(g1,g2),g3,nrow = 2,rel_heights = c(2,3))

Figure 30: Predicted vs fitted values, Standardized residuals vs fitted values,and measured (red) calculated (blue) pCO2 vs pH for DIC~TOC+Hplus

nlss$pCO2_pred <- predict(pco2.toc.Hplus, newdata = nlss, type = "response")

ggplot(data = nlss, aes(x = pH))+
  geom_point(aes(y = pCO2_pred, col = "Predicted pCO2, NLS"), alpha = 0.5)+
  geom_hline(yintercept = fCO2.atm.1995, lty = 2)+
  labs(x = "pH", col = "", y = "pCO2, atm")+
  scale_color_manual(values = c("dodgerblue"))+
  theme_bw(base_size = 15)+
  theme(legend.position = "top")

Figure 32: Predicted pCO2 in atm vs pH for the Northern Lake Survey

3.8 TA * TOC

dic.tocta <- glm(formula = DIC~TOC*TA, data = norway, family = Gamma(link = "identity"))
dic.tocta.sum <- summary(dic.tocta)$coefficients %>% as.data.frame() 
dic.tocta.sp <- rownames(dic.tocta.sum)[which(dic.toc.ta.sum$`Pr(>|t|)` < 0.05)] %>% paste(collapse = ", ")
dic.tocta.sum <- dic.tocta.sum %>% format(scientific = T, digits = 2)

kable(dic.tocta.sum, digits = 2) %>% kable_styling(bootstrap_options = "bordered")
Estimate Std. Error t value Pr(>|t|)
(Intercept) 3.0e-05 3.5e-06 8.6e+00 6.5e-15
TOC 3.2e-02 1.0e-02 3.1e+00 2.1e-03
TA 1.2e+00 1.4e-01 8.1e+00 1.2e-13
TOC:TA 4.0e+00 1.4e+02 2.9e-02 9.8e-01
norway$DIC_pred <- fitted(dic.tocta)
norway$DIC_pred_res <- residuals(dic.tocta)
dic.tocta.r2 <- with(summary(dic.tocta), 1 - deviance/null.deviance) %>% round(2)
dic.tocta.aic <- AIC(dic.tocta)

g1 <- ggplot(norway)+geom_point(aes(x=DIC,y=DIC_pred))+
  geom_abline(slope= 1, intercept = 0)+coord_trans(x="log10",y="log10")+
  annotate(x = 0.5e-3, y = 3e-3, geom = "label", label = paste("R2 = ",dic.tocta.r2, ""))+
    theme_minimal()

g2 <- ggplot(norway)+geom_point(aes(x=DIC_pred,y=DIC_pred_res))+
     coord_trans(x = "log10")+labs(x = "fitted", y = "Std Residuals")+
    theme_minimal()

norway$pCO2_calc <- with(norway, DIC_pred/(K0+K0*K1/Hplus+K0*K1*K2/(Hplus^2)))


g3 <- ggplot(norway,aes(x= pH))+geom_point(aes(y = pCO2_calc, col = "pCO2 from model"))+
  geom_point(aes(y = fCO2, col = "Measured pCO2"))+
  labs(y = "pCO2, atm", x = "pH", col = "")+
  scale_color_manual(values = c( "firebrick","dodgerblue3"))+
  facet_grid(cols = vars(survey))+
  theme_minimal()+
  theme(legend.position = "bottom")

plot_grid(ggarrange(g1,g2),g3,nrow = 2,rel_heights = c(2,3))

Figure 33: Predicted vs fitted values, Standardized residuals vs fitted values,and measured (red) calculated (blue) pCO2 vs pH for DIC~TA x TOC

nlss$dic_pred <- predict(dic.tocta, newdata = nlss, type = "response")
nlss$pCO2_calc <- with(nlss, dic_pred/(K0+K0*K1/Hplus+K0*K1*K2/Hplus))

ggplot(data = nlss, aes(x = pH))+
  geom_point(aes(y = pCO2_calc, col = "pCO2 from predicted DIC, NLS"), alpha = 0.5)+
  geom_hline(yintercept = fCO2.atm.1995, lty = 2)+
  labs(x = "pH", col = "", y = "pCO2, atm")+
  scale_color_manual(values = c("dodgerblue"))+
  theme_bw(base_size = 15)+
  theme(legend.position = "top")

Figure 34: Predicted pCO2 in atm vs pH for the Northern Lake Survey

pco2.tocta <- glm(formula = fCO2~TOC+TA+0+offset(fCO2.atm), data = norway, family = Gamma(link = "identity"))

pco2.tocta.sum <- summary(pco2.tocta)$coefficients %>% as.data.frame() 
pco2.tocta.sp <- rownames(pco2.tocta.sum)[which(pco2.tocta.sum$`Pr(>|t|)` < 0.05)] %>% paste(collapse = ", ")
pco2.tocta.sum <- pco2.tocta.sum %>% format(scientific = T, digits = 2)

kable(pco2.tocta.sum, digits = 2) %>% kable_styling(bootstrap_options = "bordered")
Estimate Std. Error t value Pr(>|t|)
TOC 9.0e-01 7.9e-02 1.1e+01 9.3e-23
TA 8.3e-01 3.7e-01 2.3e+00 2.4e-02
norway$pCO2_pred <- fitted(pco2.tocta)
norway$pCO2_pred_res <- residuals(pco2.tocta)
pco2.tocta.r2 <- with(summary(pco2.tocta), 1 - deviance/null.deviance) %>% round(2)
pco2.tocta.aic <- AIC(pco2.tocta)

g1 <- ggplot(norway)+geom_point(aes(x=fCO2,y=pCO2_pred))+   labs(x = "Measured pCO2, atm", y = "Predicted pCO2, atm")+
  geom_abline(slope= 1, intercept = 0)+coord_trans(x="log10",y="log10")+
  annotate(x = 0.5e-3, y = 3e-3, geom = "label", label = paste("R2 = ",pco2.tocta.r2, ""))+
    theme_minimal()

g2 <- ggplot(norway)+geom_point(aes(x=pCO2_pred,y=pCO2_pred_res))+
     coord_trans(x = "log10")+labs(x = "fitted", y = "Std Residuals")+
    theme_minimal()

g3 <- ggplot(norway,aes(x= pH))+geom_point(aes(y = pCO2_pred, col = "pCO2 from model"))+
  geom_point(aes(y = fCO2, col = "Measured pCO2"))+
  labs(y = "pCO2, atm", x = "pH", col = "")+
  scale_color_manual(values = c( "firebrick","dodgerblue3"))+
  facet_grid(cols = vars(survey))+
  theme_minimal()+
  theme(legend.position = "bottom")

plot_grid(ggarrange(g1,g2),g3,nrow = 2,rel_heights = c(2,3))

Figure 35: Predicted vs fitted values, Standardized residuals vs fitted values,and measured (red) calculated (blue) pCO2 vs pH for pCO2~TA x TOC

nlss$pCO2_pred <- predict(pco2.tocta, newdata = nlss, type = "response")

ggplot(data = nlss, aes(x = pH))+
  geom_point(aes(y = pCO2_pred, col = "Predicted pCO2, NLS"), alpha = 0.5)+
  geom_hline(yintercept = fCO2.atm.1995, lty = 2)+
  labs(x = "pH", col = "", y = "pCO2, atm")+
  scale_color_manual(values = c("dodgerblue"))+
  theme_bw(base_size = 15)+
  theme(legend.position = "top")

Figure 36: Predicted pCO2 in atm vs pH for the Northern Lake Survey

3.9 TA*Hplus

dic.taHplus <- glm(formula = DIC~TA*Hplus, data = norway, family = Gamma(link = "identity"))

dic.taHplus.sum <- summary(dic.taHplus)$coefficients %>% as.data.frame() 
dic.taHplus.sp <- rownames(dic.taHplus.sum)[which(dic.taHplus.sum$`Pr(>|t|)` < 0.05)] %>% paste(collapse = ", ")
dic.taHplus.sum <- dic.taHplus.sum %>% format(scientific = T, digits = 2)

kable(dic.taHplus.sum, digits = 2) %>% kable_styling(bootstrap_options = "bordered")
Estimate Std. Error t value Pr(>|t|)
(Intercept) 1.8e-05 3.9e-06 4.7e+00 4.8e-06
TA 1.4e+00 9.3e-02 1.5e+01 6.3e-33
Hplus 4.0e+00 7.8e-01 5.1e+00 8.7e-07
TA:Hplus 1.1e+05 1.1e+05 1.0e+00 3.0e-01
norway$DIC_pred <- fitted(dic.taHplus)
norway$DIC_pred_res <- residuals(dic.taHplus)
dic.taHplus.r2 <- with(summary(dic.taHplus), 1 - deviance/null.deviance) %>% round(2)
dic.taHplus.aic <- AIC(dic.taHplus)

g1 <- ggplot(norway)+geom_point(aes(x=DIC,y=DIC_pred))+
  geom_abline(slope= 1, intercept = 0)+coord_trans(x="log10",y="log10")+
  annotate(x = 0.5e-3, y = 3e-3, geom = "label", label = paste("R2 = ",dic.taHplus.r2, ""))+
    theme_minimal()

g2 <- ggplot(norway)+geom_point(aes(x=DIC_pred,y=DIC_pred_res))+
     coord_trans(x = "log10")+labs(x = "fitted", y = "Std Residuals")+
    theme_minimal()

norway$pCO2_calc <- with(norway, DIC_pred/(K0+K0*K1/Hplus+K0*K1*K2/(Hplus^2)))


g3 <- ggplot(norway,aes(x= pH))+geom_point(aes(y = pCO2_calc, col = "pCO2 from model"))+
  geom_point(aes(y = fCO2, col = "Measured pCO2"))+
  labs(y = "pCO2, atm", x = "pH", col = "")+
  scale_color_manual(values = c( "firebrick","dodgerblue3"))+
  facet_grid(cols = vars(survey))+
  theme_minimal()+
  theme(legend.position = "bottom")

plot_grid(ggarrange(g1,g2),g3,nrow = 2,rel_heights = c(2,3))

Figure 37: Predicted vs fitted values, Standardized residuals vs fitted values,and measured (red) calculated (blue) pCO2 vs pH for DIC~TA x Hplus

nlss$dic_pred <- predict(dic.taHplus, newdata = nlss, type = "response")
nlss$pCO2_calc <- with(nlss, dic_pred/(K0+K0*K1/Hplus+K0*K1*K2/Hplus))

ggplot(data = nlss, aes(x = pH))+
  geom_point(aes(y = pCO2_calc, col = "pCO2 from predicted DIC, NLS"), alpha = 0.5)+
  geom_hline(yintercept = fCO2.atm.1995, lty = 2)+
  labs(x = "pH", col = "", y = "pCO2, atm")+
  scale_color_manual(values = c("dodgerblue"))+
  theme_bw(base_size = 15)+
  theme(legend.position = "top")

Figure 38: Predicted pCO2 in atm vs pH for the Northern Lake Survey

pco2.taHplus <- glm(formula = fCO2~TA*Hplus+0+offset(fCO2.atm), data = norway, family = Gamma(link = "identity"))

pco2.taHplus.sum <- summary(pco2.taHplus)$coefficients %>% as.data.frame() 
pco2.taHplus.sp <- rownames(pco2.taHplus.sum)[which(pco2.taHplus.sum$`Pr(>|t|)` < 0.05)] %>% paste(collapse = ", ")
pco2.taHplus.sum <- pco2.taHplus.sum %>% format(scientific = T, digits = 2)

kable(pco2.taHplus.sum, digits = 2) %>% kable_styling(bootstrap_options = "bordered")
Estimate Std. Error t value Pr(>|t|)
TA 2.7e+00 3.3e-01 8.0e+00 1.7e-13
Hplus 5.2e+01 7.1e+00 7.3e+00 8.2e-12
TA:Hplus 9.7e+06 1.3e+06 7.4e+00 7.2e-12
norway$pCO2_pred <- fitted(pco2.taHplus)
norway$pCO2_pred_res <- residuals(pco2.taHplus)
pco2.taHplus.r2 <- with(summary(pco2.taHplus), 1 - deviance/null.deviance) %>% round(2)
pco2.taHplus.aic <- AIC(pco2.taHplus)

g1 <- ggplot(norway)+geom_point(aes(x=fCO2,y=pCO2_pred))+   labs(x = "Measured pCO2, atm", y = "Predicted pCO2, atm")+
  geom_abline(slope= 1, intercept = 0)+coord_trans(x="log10",y="log10")+
  annotate(x = 0.5e-3, y = 3e-3, geom = "label", label = paste("R2 = ",pco2.taHplus.r2, ""))+
    theme_minimal()

g2 <- ggplot(norway)+geom_point(aes(x=pCO2_pred,y=pCO2_pred_res))+
     coord_trans(x = "log10")+labs(x = "fitted", y = "Std Residuals")+
    theme_minimal()

g3 <- ggplot(norway,aes(x= pH))+geom_point(aes(y = pCO2_pred, col = "pCO2 from model"))+
  geom_point(aes(y = fCO2, col = "Measured pCO2"))+
  labs(y = "pCO2, atm", x = "pH", col = "")+
  scale_color_manual(values = c( "firebrick","dodgerblue3"))+
  facet_grid(cols = vars(survey))+
  theme_minimal()+
  theme(legend.position = "bottom")

plot_grid(ggarrange(g1,g2),g3,nrow = 2,rel_heights = c(2,3))

Figure 39: Predicted vs fitted values, Standardized residuals vs fitted values,and measured (red) calculated (blue) pCO2 vs pH for DIC~TA

nlss$pCO2_pred <- predict(pco2.taHplus, newdata = nlss, type = "response")

ggplot(data = nlss, aes(x = pH))+
  geom_point(aes(y = pCO2_pred, col = "Predicted pCO2, NLS"), alpha = 0.5)+
  geom_hline(yintercept = fCO2.atm.1995, lty = 2)+
  labs(x = "pH", col = "", y = "pCO2, atm")+
  scale_color_manual(values = c("dodgerblue"))+
  theme_bw(base_size = 15)+
  theme(legend.position = "top")

Figure 40: Predicted pCO2 in atm vs pH for the Northern Lake Survey

3.10 TOC*Hplus

dic.TOCHplus <- glm(formula = DIC~TOC*Hplus, data = norway, family = Gamma(link = "identity"))

dic.TOCHplus.sum <- summary(dic.TOCHplus)$coefficients %>% as.data.frame() 
dic.TOCHplus.sp <- rownames(dic.TOCHplus.sum)[which(dic.TOCHplus.sum$`Pr(>|t|)` < 0.05)] %>% paste(collapse = ", ")
dic.TOCHplus.sum <- dic.TOCHplus.sum %>% format(scientific = T, digits = 2)

kable(dic.TOCHplus.sum, digits = 2) %>% kable_styling(bootstrap_options = "bordered")
Estimate Std. Error t value Pr(>|t|)
(Intercept) 2.5e-05 1.0e-05 2.4e+00 1.6e-02
TOC 2.9e-01 3.8e-02 7.4e+00 4.6e-12
Hplus -1.6e+00 2.4e+00 -6.4e-01 5.3e-01
TOC:Hplus -7.2e+03 4.0e+03 -1.8e+00 7.2e-02
norway$DIC_pred <- fitted(dic.TOCHplus)
norway$DIC_pred_res <- residuals(dic.TOCHplus)
dic.TOCHplus.r2 <- with(summary(dic.TOCHplus), 1 - deviance/null.deviance) %>% round(2)
dic.TOCHplus.aic <- AIC(dic.TOCHplus)

g1 <- ggplot(norway)+geom_point(aes(x=DIC,y=DIC_pred))+
  geom_abline(slope= 1, intercept = 0)+coord_trans(x="log10",y="log10")+
  annotate(x = 0.5e-3, y = 3e-3, geom = "label", label = paste("R2 = ",dic.TOCHplus.r2, ""))+
    theme_minimal()

g2 <- ggplot(norway)+geom_point(aes(x=DIC_pred,y=DIC_pred_res))+
     coord_trans(x = "log10")+labs(x = "fitted", y = "Std Residuals")+
    theme_minimal()

norway$pCO2_calc <- with(norway, DIC_pred/(K0+K0*K1/Hplus+K0*K1*K2/(Hplus^2)))


g3 <- ggplot(norway,aes(x= pH))+geom_point(aes(y = pCO2_calc, col = "pCO2 from model"))+
  geom_point(aes(y = fCO2, col = "Measured pCO2"))+
  labs(y = "pCO2, atm", x = "pH", col = "")+
  scale_color_manual(values = c( "firebrick","dodgerblue3"))+
  facet_grid(cols = vars(survey))+
  theme_minimal()+
  theme(legend.position = "bottom")

plot_grid(ggarrange(g1,g2),g3,nrow = 2,rel_heights = c(2,3))

Figure 41: Predicted vs fitted values, Standardized residuals vs fitted values,and measured (red) calculated (blue) pCO2 vs pH for DIC~TOC x Hplus

nlss$dic_pred <- predict(dic.TOCHplus, newdata = nlss, type = "response")
nlss$pCO2_calc <- with(nlss, dic_pred/(K0+K0*K1/Hplus+K0*K1*K2/Hplus))

ggplot(data = nlss, aes(x = pH))+
  geom_point(aes(y = pCO2_calc, col = "pCO2 from predicted DIC, NLS"), alpha = 0.5)+
  geom_hline(yintercept = fCO2.atm.1995, lty = 2)+
  labs(x = "pH", col = "", y = "pCO2, atm")+
  scale_color_manual(values = c("dodgerblue"))+
  theme_bw(base_size = 15)+
  theme(legend.position = "top")

Figure 42: Predicted pCO2 in atm vs pH for the Northern Lake Survey

pco2.TOCHplus <- glm(formula = fCO2~TOC*Hplus+0+offset(fCO2.atm), data = norway, family = Gamma(link = "identity"))

pco2.TOCHplus.sum <- summary(pco2.TOCHplus)$coefficients %>% as.data.frame() 
pco2.TOCHplus.sp <- rownames(pco2.TOCHplus.sum)[which(pco2.TOCHplus.sum$`Pr(>|t|)` < 0.05)] %>% paste(collapse = ", ")
pco2.TOCHplus.sum <- pco2.TOCHplus.sum %>% format(scientific = T, digits = 2)

kable(pco2.TOCHplus.sum, digits = 2) %>% kable_styling(bootstrap_options = "bordered")
Estimate Std. Error t value Pr(>|t|)
TOC 9.9e-01 7.1e-02 1.4e+01 3.5e-30
Hplus 1.6e+01 1.2e+01 1.4e+00 1.6e-01
TOC:Hplus -2.9e+03 2.5e+04 -1.1e-01 9.1e-01
norway$pCO2_pred <- fitted(pco2.TOCHplus)
norway$pCO2_pred_res <- residuals(pco2.TOCHplus)
pco2.TOCHplus.r2 <- with(summary(pco2.TOCHplus), 1 - deviance/null.deviance) %>% round(2)
pco2.TOCHplus.aic <- AIC(pco2.TOCHplus)

g1 <- ggplot(norway)+geom_point(aes(x=fCO2,y=pCO2_pred))+   labs(x = "Measured pCO2, atm", y = "Predicted pCO2, atm")+
  geom_abline(slope= 1, intercept = 0)+coord_trans(x="log10",y="log10")+
  annotate(x = 0.5e-3, y = 3e-3, geom = "label", label = paste("R2 = ",pco2.TOCHplus.r2, ""))+
    theme_minimal()

g2 <- ggplot(norway)+geom_point(aes(x=pCO2_pred,y=pCO2_pred_res))+
     coord_trans(x = "log10")+labs(x = "fitted", y = "Std Residuals")+
    theme_minimal()

g3 <- ggplot(norway,aes(x= pH))+geom_point(aes(y = pCO2_pred, col = "pCO2 from model"))+
  geom_point(aes(y = fCO2, col = "Measured pCO2"))+
  labs(y = "pCO2, atm", x = "pH", col = "")+
  scale_color_manual(values = c( "firebrick","dodgerblue3"))+
  facet_grid(cols = vars(survey))+
  theme_minimal()+
  theme(legend.position = "bottom")

plot_grid(ggarrange(g1,g2),g3,nrow = 2,rel_heights = c(2,3))

Figure 41: Predicted vs fitted values, Standardized residuals vs fitted values,and measured (red) calculated (blue) pCO2 vs pH for DIC~TOC x Hplus

nlss$pCO2_pred <- predict(pco2.TOCHplus, newdata = nlss, type = "response")

ggplot(data = nlss, aes(x = pH))+
  geom_point(aes(y = pCO2_pred, col = "Predicted pCO2, NLS"), alpha = 0.5)+
  geom_hline(yintercept = fCO2.atm.1995, lty = 2)+
  labs(x = "pH", col = "", y = "pCO2, atm")+
  scale_color_manual(values = c("dodgerblue"))+
  theme_bw(base_size = 15)+
  theme(legend.position = "top")

Figure 42: Predicted pCO2 in atm vs pH for the Northern Lake Survey

3.11 TA + TOC*Hplus

dic.tocHplus.ta <- glm(formula = DIC~TOC*Hplus+TA, data = norway, family = Gamma(link = "identity"))
dic.tocHplus.ta.sum <- summary(dic.tocHplus.ta)$coefficients %>% as.data.frame() 
dic.tocHplus.ta.sp <- rownames(dic.tocHplus.ta.sum)[which(dic.tocHplus.ta.sum$`Pr(>|t|)` < 0.05)] %>% paste(collapse = ", ")
dic.tocHplus.ta.sum <- dic.tocHplus.ta.sum %>% format(scientific = T, digits = 2)

kable(dic.tocHplus.ta.sum, digits = 2) %>% kable_styling(bootstrap_options = "bordered")
Estimate Std. Error t value Pr(>|t|)
(Intercept) 2.5e-05 4.3e-06 5.8e+00 2.8e-08
TOC -6.0e-03 5.9e-03 -1.0e+00 3.1e-01
Hplus 1.4e+00 1.2e+00 1.2e+00 2.4e-01
TA 1.4e+00 1.0e-01 1.4e+01 4.1e-30
TOC:Hplus 4.5e+03 2.0e+03 2.3e+00 2.4e-02
norway$DIC_pred <- fitted(dic.tocHplus.ta)
norway$DIC_pred_res <- residuals(dic.tocHplus.ta)
dic.tocHplus.ta.r2 <- with(summary(dic.tocHplus.ta), 1 - deviance/null.deviance) %>% round(2)
dic.tocHplus.ta.aic <- AIC(dic.tocHplus.ta)

g1 <- ggplot(norway)+geom_point(aes(x=DIC,y=DIC_pred))+
  geom_abline(slope= 1, intercept = 0)+coord_trans(x="log10",y="log10")+
  annotate(x = 0.5e-3, y = 3e-3, geom = "label", label = paste("R2 = ",dic.tocHplus.ta.r2, ""))+
    theme_minimal()

g2 <- ggplot(norway)+geom_point(aes(x=DIC_pred,y=DIC_pred_res))+
     coord_trans(x = "log10")+labs(x = "fitted", y = "Std Residuals")+
    theme_minimal()

norway$pCO2_calc <- with(norway, DIC_pred/(K0+K0*K1/Hplus+K0*K1*K2/(Hplus^2)))


g3 <- ggplot(norway,aes(x= pH))+geom_point(aes(y = pCO2_calc, col = "pCO2 from model"))+
  geom_point(aes(y = fCO2, col = "Measured pCO2"))+
  labs(y = "pCO2, atm", x = "pH", col = "")+
  scale_color_manual(values = c( "firebrick","dodgerblue3"))+
  facet_grid(cols = vars(survey))+
  theme_minimal()+
  theme(legend.position = "bottom")

plot_grid(ggarrange(g1,g2),g3,nrow = 2,rel_heights = c(2,3))

Figure 43: Predicted vs fitted values, Standardized residuals vs fitted values,and measured (red) calculated (blue) pCO2 vs pH for DIC ~ TA + TOC x Hplus

nlss$dic_pred <- predict(dic.tocHplus.ta, newdata = nlss, type = "response")
nlss$pCO2_calc <- with(nlss, dic_pred/(K0+K0*K1/Hplus+K0*K1*K2/Hplus))

ggplot(data = nlss, aes(x = pH))+
  geom_point(aes(y = pCO2_calc, col = "pCO2 from predicted DIC, NLS"), alpha = 0.5)+
  geom_hline(yintercept = fCO2.atm.1995, lty = 2)+
  labs(x = "pH", col = "", y = "pCO2, atm")+
  scale_color_manual(values = c("dodgerblue"))+
  theme_bw(base_size = 15)+
  theme(legend.position = "top")

Figure 44: Predicted pCO2 in atm vs pH for the Northern Lake Survey

pco2.tocHplus.ta <- glm(formula = fCO2~TOC*Hplus+TA+0+offset(fCO2.atm), data = norway, family = Gamma(link = "identity"))

pco2.tocHplus.ta.sum <- summary(pco2.tocHplus.ta)$coefficients %>% as.data.frame() 
pco2.tocHplus.ta.sp <- rownames(pco2.tocHplus.ta.sum)[which(pco2.tocHplus.ta.sum$`Pr(>|t|)` < 0.05)] %>% paste(collapse = ", ")
pco2.tocHplus.ta.sum <- pco2.tocHplus.ta.sum %>% format(scientific = T, digits = 2)

kable(pco2.tocHplus.ta.sum, digits = 2) %>% kable_styling(bootstrap_options = "bordered")
Estimate Std. Error t value Pr(>|t|)
TOC 5.6e-01 9.1e-02 6.2e+00 5.3e-09
Hplus 1.7e+01 1.2e+01 1.5e+00 1.3e-01
TA 1.9e+00 4.5e-01 4.2e+00 3.6e-05
TOC:Hplus 5.0e+04 2.9e+04 1.8e+00 7.9e-02
norway$pCO2_pred <- fitted(pco2.tocHplus.ta)
norway$pCO2_pred_res <- residuals(pco2.tocHplus.ta)
pco2.tocHplus.ta.r2 <- with(summary(pco2.tocHplus.ta), 1- deviance/null.deviance) %>% round(2)
pco2.tocHplus.ta.aic <- AIC(pco2.tocHplus.ta)

g1 <- ggplot(norway)+geom_point(aes(x=fCO2,y=pCO2_pred))+   labs(x = "Measured pCO2, atm", y = "Predicted pCO2, atm")+
  geom_abline(slope= 1, intercept = 0)+coord_trans(x="log10",y="log10")+
  annotate(x = 0.5e-3, y = 3e-3, geom = "label", label = paste("R2 = ",pco2.tocHplus.ta.r2, ""))+
    theme_minimal()

g2 <- ggplot(norway)+geom_point(aes(x=pCO2_pred,y=pCO2_pred_res))+
     coord_trans(x = "log10")+labs(x = "fitted", y = "Std Residuals")+
    theme_minimal()

g3 <- ggplot(norway,aes(x= pH))+geom_point(aes(y = pCO2_pred, col = "pCO2 from model"))+
  geom_point(aes(y = fCO2, col = "Measured pCO2"))+
  labs(y = "pCO2, atm", x = "pH", col = "")+
  scale_color_manual(values = c( "firebrick","dodgerblue3"))+
  facet_grid(cols = vars(survey))+
  theme_minimal()+
  theme(legend.position = "bottom")

plot_grid(ggarrange(g1,g2),g3,nrow = 2,rel_heights = c(2,3))

Figure 43: Predicted vs fitted values, Standardized residuals vs fitted values,and measured (red) calculated (blue) pCO2 vs pH for DIC ~ TA + TOC x Hplus

nlss$pCO2_pred <- predict(pco2.tocHplus.ta, newdata = nlss, type = "response")

ggplot(data = nlss, aes(x = pH))+
  geom_point(aes(y = pCO2_pred, col = "Predicted pCO2, NLS"), alpha = 0.5)+
  geom_hline(yintercept = fCO2.atm.1995, lty = 2)+
  labs(x = "pH", col = "", y = "pCO2, atm")+
  scale_color_manual(values = c("dodgerblue"))+
  theme_bw(base_size = 15)+
  theme(legend.position = "top")

Figure 44: Predicted pCO2 in atm vs pH for the Northern Lake Survey

3.12 TOC + TA*Hplus

dic.toc.taHplus <- glm(formula = DIC~TOC+TA*Hplus, data = norway, family = Gamma(link = "identity"))

dic.toc.taHplus.sum <- summary(dic.toc.taHplus)$coefficients %>% as.data.frame() 
dic.toc.taHplus.sp <- rownames(dic.toc.taHplus.sum)[which(dic.toc.taHplus.sum$`Pr(>|t|)` < 0.05)] %>% paste(collapse = ", ")
dic.toc.taHplus.sum <- dic.toc.taHplus.sum %>% format(scientific = T, digits = 2)
kable(dic.toc.taHplus.sum, digits = 2) %>% kable_styling(bootstrap_options = "bordered")
Estimate Std. Error t value Pr(>|t|)
(Intercept) 1.8e-05 3.9e-06 4.7e+00 5.7e-06
TOC -4.0e-04 6.9e-03 -5.8e-02 9.5e-01
TA 1.4e+00 1.0e-01 1.3e+01 1.8e-28
Hplus 4.0e+00 8.4e-01 4.8e+00 3.9e-06
TA:Hplus 1.1e+05 1.1e+05 1.0e+00 3.2e-01
norway$DIC_pred <- fitted(dic.toc.taHplus)
norway$DIC_pred_res <- residuals(dic.toc.taHplus)
dic.toc.taHplus.r2 <- with(summary(dic.toc.taHplus), 1 - deviance/null.deviance) %>% round(2)
dic.toc.taHplus.aic <- AIC(dic.toc.taHplus)

g1 <- ggplot(norway)+geom_point(aes(x=DIC,y=DIC_pred))+
  geom_abline(slope= 1, intercept = 0)+coord_trans(x="log10",y="log10")+
  annotate(x = 0.5e-3, y = 3e-3, geom = "label", label = paste("R2 = ",dic.toc.taHplus.r2, ""))+
    theme_minimal()

g2 <- ggplot(norway)+geom_point(aes(x=DIC_pred,y=DIC_pred_res))+
     coord_trans(x = "log10")+labs(x = "fitted", y = "Std Residuals")+
    theme_minimal()

norway$pCO2_calc <- with(norway, DIC_pred/(K0+K0*K1/Hplus+K0*K1*K2/(Hplus^2)))


g3 <- ggplot(norway,aes(x= pH))+geom_point(aes(y = pCO2_calc, col = "pCO2 from model"))+
  geom_point(aes(y = fCO2, col = "Measured pCO2"))+
  labs(y = "pCO2, atm", x = "pH", col = "")+
  scale_color_manual(values = c( "firebrick","dodgerblue3"))+
  facet_grid(cols = vars(survey))+
  theme_minimal()+
  theme(legend.position = "bottom")

plot_grid(ggarrange(g1,g2),g3,nrow = 2,rel_heights = c(2,3))

Figure 9: Predicted vs fitted values, Standardized residuals vs fitted values,and measured (red) calculated (blue) pCO2 vs pH for DIC~TA

nlss$dic_pred <- predict(dic.toc.taHplus, newdata = nlss, type = "response")
nlss$pCO2_calc <- with(nlss, dic_pred/(K0+K0*K1/Hplus+K0*K1*K2/Hplus))

ggplot(data = nlss, aes(x = pH))+
  geom_point(aes(y = pCO2_calc, col = "pCO2 from predicted DIC, NLS"), alpha = 0.5)+
  geom_hline(yintercept = fCO2.atm.1995, lty = 2)+
  labs(x = "pH", col = "", y = "pCO2, atm")+
  scale_color_manual(values = c("dodgerblue"))+
  theme_bw(base_size = 15)+
  theme(legend.position = "top")

Figure 45: Predicted pCO2 in atm vs pH for the Northern Lake Survey

pco2.toc.taHplus <- glm(formula = fCO2~TOC+TA*Hplus+0+offset(fCO2.atm), data = norway, family = Gamma(link = "identity"))

pco2.toc.taHplus.sum <- summary(pco2.toc.taHplus)$coefficients %>% as.data.frame()
pco2.toc.taHplus.sp <- rownames(pco2.toc.taHplus.sum)[which(pco2.toc.taHplus.sum$`Pr(>|t|)` < 0.05)] %>% paste(collapse = ", ")
pco2.toc.taHplus.sum <- pco2.toc.taHplus.sum %>% format(scientific = T, digits = 2)

kable(pco2.toc.taHplus.sum, digits = 2) %>% kable_styling(bootstrap_options = "bordered")
Estimate Std. Error t value Pr(>|t|)
TOC -1.4e-02 2.7e-02 -5.3e-01 6.0e-01
TA 2.7e+00 3.5e-01 7.7e+00 1.1e-12
Hplus 5.3e+01 7.2e+00 7.4e+00 8.1e-12
TA:Hplus 9.9e+06 1.4e+06 7.3e+00 1.1e-11
norway$pCO2_pred <- fitted(pco2.toc.taHplus)
norway$pCO2_pred_res <- residuals(pco2.toc.taHplus)
pco2.toc.taHplus.r2 <- with(summary(pco2.toc.taHplus), 1 - deviance/null.deviance) %>% round(2)
pco2.toc.taHplus.aic <- AIC(pco2.toc.taHplus)

g1 <- ggplot(norway)+geom_point(aes(x=fCO2,y=pCO2_pred))+   labs(x = "Measured pCO2, atm", y = "Predicted pCO2, atm")+
  geom_abline(slope= 1, intercept = 0)+coord_trans(x="log10",y="log10")+
  annotate(x = 0.5e-3, y = 3e-3, geom = "label", label = paste("R2 = ",pco2.toc.taHplus.r2, ""))+
    theme_minimal()

g2 <- ggplot(norway)+geom_point(aes(x=pCO2_pred,y=pCO2_pred_res))+
     coord_trans(x = "log10")+labs(x = "fitted", y = "Std Residuals")+
    theme_minimal()

g3 <- ggplot(norway,aes(x= pH))+geom_point(aes(y = pCO2_pred, col = "pCO2 from model"))+
  geom_point(aes(y = fCO2, col = "Measured pCO2"))+
  labs(y = "pCO2, atm", x = "pH", col = "")+
  scale_color_manual(values = c( "firebrick","dodgerblue3"))+
  facet_grid(cols = vars(survey))+
  theme_minimal()+
  theme(legend.position = "bottom")

plot_grid(ggarrange(g1,g2),g3,nrow = 2,rel_heights = c(2,3))

Figure 46: Predicted vs fitted values, Standardized residuals vs fitted values,and measured (red) calculated (blue) pCO2 vs pH for pCO2~TOC + TA x Hplus

nlss$pCO2_pred <- predict(pco2.toc.taHplus, newdata = nlss, type = "response")

ggplot(data = nlss, aes(x = pH))+
  geom_point(aes(y = pCO2_pred, col = "Predicted pCO2, NLS"), alpha = 0.5)+
  geom_hline(yintercept = fCO2.atm.1995, lty = 2)+
  labs(x = "pH", col = "", y = "pCO2, atm")+
  scale_color_manual(values = c("dodgerblue"))+
  theme_bw(base_size = 15)+
  theme(legend.position = "top")

Figure 47: Predicted pCO2 in atm vs pH for the Northern Lake Survey

3.13 TOC + TA*invHplus

norway$invHplus <- with(norway, 1/Hplus)

dic.toc.tainvHplus <- glm(formula = DIC~TOC+TA*invHplus, data = norway, family = Gamma(link = "identity"))

dic.toc.tainvHplus.sum <- summary(dic.toc.tainvHplus)$coefficients %>% as.data.frame()
dic.toc.tainvHplus.sp <- rownames(dic.toc.tainvHplus.sum)[which(dic.toc.tainvHplus.sum$`Pr(>|t|)` < 0.05)] %>% paste(collapse = ", ")
dic.toc.tainvHplus.sum <- dic.toc.tainvHplus.sum %>% format(scientific = T, digits = 2)

kable(dic.toc.tainvHplus.sum, digits = 2) %>% kable_styling(bootstrap_options = "bordered")
Estimate Std. Error t value Pr(>|t|)
(Intercept) 3.4e-05 2.6e-06 1.3e+01 7.4e-27
TOC 4.1e-02 7.2e-03 5.7e+00 6.1e-08
TA 5.0e-01 1.2e-01 4.2e+00 3.7e-05
invHplus -4.2e-12 1.9e-12 -2.2e+00 2.9e-02
TA:invHplus 8.2e-08 1.2e-08 6.9e+00 1.0e-10
norway$DIC_pred <- fitted(dic.toc.tainvHplus)
norway$DIC_pred_res <- residuals(dic.toc.tainvHplus)
dic.toc.tainvHplus.r2 <- with(summary(dic.toc.tainvHplus), 1 - deviance/null.deviance) %>% round(2)
dic.toc.tainvHplus.aic <- AIC(dic.toc.tainvHplus)

g1 <- ggplot(norway)+geom_point(aes(x=DIC,y=DIC_pred))+
  geom_abline(slope= 1, intercept = 0)+coord_trans(x="log10",y="log10")+
  annotate(x = 0.5e-3, y = 3e-3, geom = "label", label = paste("R2 = ",dic.toc.tainvHplus.r2, ""))+
    theme_minimal()

g2 <- ggplot(norway)+geom_point(aes(x=DIC_pred,y=DIC_pred_res))+
     coord_trans(x = "log10")+labs(x = "fitted", y = "Std Residuals")+
    theme_minimal()

norway$pCO2_calc <- with(norway, DIC_pred/(K0+K0*K1/Hplus+K0*K1*K2/(Hplus^2)))


g3 <- ggplot(norway,aes(x= pH))+geom_point(aes(y = pCO2_calc, col = "pCO2 from model"))+
  geom_point(aes(y = fCO2, col = "Measured pCO2"))+
  labs(y = "pCO2, atm", x = "pH", col = "")+
  scale_color_manual(values = c( "firebrick","dodgerblue3"))+
  facet_grid(cols = vars(survey))+
  theme_minimal()+
  theme(legend.position = "bottom")

plot_grid(ggarrange(g1,g2),g3,nrow = 2,rel_heights = c(2,3))

Figure 48: Predicted vs fitted values, Standardized residuals vs fitted values,and measured (red) calculated (blue) pCO2 vs pH for DIC~TOC + TA/hplus

nlss$dic_pred <- predict(dic.toc.tainvHplus, newdata = nlss, type = "response")
nlss$pCO2_calc <- with(nlss, dic_pred/(K0+K0*K1/Hplus+K0*K1*K2/Hplus))

ggplot(data = nlss, aes(x = pH))+
  geom_point(aes(y = pCO2_calc, col = "pCO2 from predicted DIC, NLS"), alpha = 0.5)+
  geom_hline(yintercept = fCO2.atm.1995, lty = 2)+
  labs(x = "pH", col = "", y = "pCO2, atm")+
  scale_color_manual(values = c("dodgerblue"))+
  theme_bw(base_size = 15)+
  theme(legend.position = "top")

Figure 49: Predicted pCO2 in atm vs pH for the Northern Lake Survey

norway$invHplus <- with(norway, 1/Hplus)

pco2.toc.tainvHplus <- glm(formula = fCO2~TOC+TA*invHplus+0+offset(fCO2.atm), data = norway, family = Gamma(link = "identity"))

pco2.toc.tainvHplus.sum <- summary(pco2.toc.tainvHplus)$coefficients %>% as.data.frame() 
pco2.toc.tainvHplus.sp <- rownames(pco2.toc.tainvHplus.sum)[which(pco2.toc.tainvHplus.sum$`Pr(>|t|)` < 0.05)] %>% paste(collapse = ", ")
pco2.toc.tainvHplus.sum <- pco2.toc.tainvHplus.sum %>% format(scientific = T, digits = 2)

kable(pco2.toc.tainvHplus.sum, digits = 2) %>% kable_styling(bootstrap_options = "bordered")
Estimate Std. Error t value Pr(>|t|)
TOC 8.7e-01 9.0e-02 9.7e+00 6.5e-18
TA 1.9e+00 8.1e-01 2.4e+00 1.8e-02
invHplus -1.4e-11 6.9e-12 -2.1e+00 3.8e-02
TA:invHplus -1.2e-08 1.9e-08 -6.1e-01 5.4e-01
norway$pCO2_pred <- fitted(pco2.toc.tainvHplus)
norway$pCO2_pred_res <- residuals(pco2.toc.tainvHplus)
pco2.toc.tainvHplus.r2 <- with(summary(pco2.toc.tainvHplus), 1 - deviance / null.deviance) %>% round(2)
pco2.toc.tainvHplus.aic <- AIC(pco2.toc.tainvHplus)

g1 <- ggplot(norway)+geom_point(aes(x=fCO2,y=pCO2_pred))+   labs(x = "Measured pCO2, atm", y = "Predicted pCO2, atm")+
  geom_abline(slope= 1, intercept = 0)+coord_trans(x="log10",y="log10")+
  annotate(x = 0.5e-3, y = 3e-3, geom = "label", label = paste("R2 = ",pco2.toc.tainvHplus.r2, ""))+
    theme_minimal()

g2 <- ggplot(norway)+geom_point(aes(x=pCO2_pred,y=pCO2_pred_res))+
   coord_trans(x = "log10")+labs(x = "fitted", y = "Std Residuals")+
    theme_minimal()

g3 <- ggplot(norway,aes(x= pH))+geom_point(aes(y = pCO2_pred, col = "pCO2 from model"))+
  geom_point(aes(y = fCO2, col = "Measured pCO2"))+
  labs(y = "pCO2, atm", x = "pH", col = "")+
  scale_color_manual(values = c( "firebrick","dodgerblue3"))+
  facet_grid(cols = vars(survey))+
  theme_minimal()+
  theme(legend.position = "bottom")

plot_grid(ggarrange(g1,g2),g3,nrow = 2,rel_heights = c(2,3))

*Figure 9: Predicted vs fitted values, Standardized residuals vs fitted values,and measured (red) calculated (blue) pCO2 vs pH for DIC~TA

nlss$pCO2_pred <- predict(pco2.toc.tainvHplus, newdata = nlss, type = "response")

ggplot(data = nlss, aes(x = pH))+
  geom_point(aes(y = pCO2_pred, col = "Predicted pCO2, NLS"), alpha = 0.5)+
  geom_hline(yintercept = fCO2.atm.1995, lty = 2)+
  labs(x = "pH", col = "", y = "pCO2, atm")+
  scale_color_manual(values = c("dodgerblue"))+
  theme_bw(base_size = 15)+
  theme(legend.position = "top")

Figure 50: Predicted pCO2 in atm vs pH for the Northern Lake Survey

3.14 TA + TOC*invHplus

norway$invHplus <- with(norway, 1/Hplus)

dic.tocinvHplus.ta <- glm(formula = DIC~TOC*invHplus+TA, data = norway, family = Gamma(link = "identity"))

dic.tocinvHplus.ta.sum <- summary(dic.tocinvHplus.ta)$coefficients %>% as.data.frame() 
dic.tocinvHplus.ta.sp <- rownames(dic.tocinvHplus.ta.sum)[which(dic.tocinvHplus.ta.sum$`Pr(>|t|)` < 0.05)] %>% paste(collapse = ", ")
dic.tocinvHplus.ta.sum <- dic.tocinvHplus.ta.sum %>% format(scientific = T, digits = 2)

kable(dic.tocinvHplus.ta.sum, digits = 2) %>% kable_styling(bootstrap_options = "bordered")
Estimate Std. Error t value Pr(>|t|)
(Intercept) 2.9e-05 3.2e-06 9.0e+00 5.9e-16
TOC 4.0e-02 9.4e-03 4.3e+00 3.3e-05
invHplus 2.9e-12 2.6e-12 1.1e+00 2.6e-01
TA 5.9e-01 1.5e-01 4.0e+00 9.9e-05
TOC:invHplus 1.1e-08 3.5e-09 3.1e+00 2.6e-03
norway$DIC_pred <- fitted(dic.tocinvHplus.ta)
norway$DIC_pred_res <- residuals(dic.tocinvHplus.ta)
dic.tocinvHplus.ta.r2 <- with(summary(dic.tocinvHplus.ta), 1 - deviance/null.deviance) %>% round(2)
dic.tocinvHplus.ta.aic <- AIC(dic.tocinvHplus.ta)

g1 <- ggplot(norway)+geom_point(aes(x=DIC,y=DIC_pred))+
  geom_abline(slope= 1, intercept = 0)+coord_trans(x="log10",y="log10")+
  annotate(x = 0.5e-3, y = 3e-3, geom = "label", label = paste("R2 = ",dic.tocinvHplus.ta.r2, ""))+
    theme_minimal()

g2 <- ggplot(norway)+geom_point(aes(x=DIC_pred,y=DIC_pred_res))+   
    coord_trans(x = "log10")+labs(x = "fitted", y = "Std Residuals")+
    theme_minimal()

norway$pCO2_calc <- with(norway, DIC_pred/(K0+K0*K1/Hplus+K0*K1*K2/(Hplus^2)))


g3 <- ggplot(norway,aes(x= pH))+geom_point(aes(y = pCO2_calc, col = "pCO2 from model"))+
  geom_point(aes(y = fCO2, col = "Measured pCO2"))+
  labs(y = "pCO2, atm", x = "pH", col = "")+
  scale_color_manual(values = c( "firebrick","dodgerblue3"))+
  facet_grid(cols = vars(survey))+
  theme_minimal()+
  theme(legend.position = "bottom")

plot_grid(ggarrange(g1,g2),g3+labs(caption = ),nrow = 2,rel_heights = c(2,3))

Figure 51: Predicted vs fitted values, Standardized residuals vs fitted values,and measured (red) calculated (blue) pCO2 vs pH for DIC ~ TOC / Hplus + TA

nlss$dic_pred <- predict(dic.tocinvHplus.ta, newdata = nlss, type = "response")
nlss$pCO2_calc <- with(nlss, dic_pred/(K0+K0*K1/Hplus+K0*K1*K2/Hplus))

ggplot(data = nlss, aes(x = pH))+
  geom_point(aes(y = pCO2_calc, col = "pCO2 from predicted DIC, NLS"), alpha = 0.5)+
  geom_hline(yintercept = fCO2.atm.1995, lty = 2)+
  labs(x = "pH", col = "", y = "pCO2, atm")+
  scale_color_manual(values = c("dodgerblue"))+
  theme_bw(base_size = 15)+
  theme(legend.position = "top")

Figure 52: Predicted pCO2 in atm vs pH for the Northern Lake Survey

norway$invHplus <- with(norway, 1/Hplus)

pco2.tocinvHplus.ta <- glm(formula = fCO2~TOC*invHplus+TA+0+offset(fCO2.atm), data = norway, start = c(0,0,0,0), family = Gamma(link = "identity"))

pco2.tocinvHplus.ta.sum <- summary(pco2.tocinvHplus.ta)$coefficients %>% as.data.frame() 
pco2.tocinvHplus.ta.sp <- rownames(pco2.tocinvHplus.ta.sum)[which(pco2.tocinvHplus.ta.sum$`Pr(>|t|)` < 0.05)] %>% paste(collapse = ", ")
pco2.tocinvHplus.ta.sum <- pco2.tocinvHplus.ta.sum %>% format(scientific = T, digits = 2)
  

kable(pco2.tocinvHplus.ta.sum, digits = 2) %>% kable_styling(bootstrap_options = "bordered")
Estimate Std. Error t value Pr(>|t|)
TOC 8.8e-01 8.0e-02 1.1e+01 1.3e-21
invHplus -8.8e-12 8.0e-12 -1.1e+00 2.8e-01
TA 1.9e+00 6.1e-01 3.2e+00 1.5e-03
TOC:invHplus -1.5e-08 1.1e-08 -1.3e+00 1.9e-01
norway$pCO2_pred <- fitted(pco2.tocinvHplus.ta)
norway$pCO2_pred_res <- residuals(pco2.tocinvHplus.ta)
pco2.tocinvHplus.ta.r2 <- with(summary(pco2.tocinvHplus.ta), 1 - deviance/null.deviance) %>% round(2)
pco2.tocinvHplus.ta.aic <- AIC(pco2.tocinvHplus.ta)

g1 <- ggplot(norway)+geom_point(aes(x=fCO2,y=pCO2_pred))+   labs(x = "Measured pCO2, atm", y = "Predicted pCO2, atm")+
  labs(x = "Measured pCO2, atm", y = "Predicted pCO2, atm")+
  geom_abline(slope= 1, intercept = 0)+coord_trans(x="log10",y="log10")+
  annotate(x = 0.5e-3, y = 3e-3, geom = "label", label = paste("R2 = ",pco2.tocinvHplus.ta.r2, ""))+
    theme_minimal()

g2 <- ggplot(norway)+geom_point(aes(x=pCO2_pred,y=pCO2_pred_res))+
     coord_trans(x = "log10")+labs(x = "fitted", y = "Std Residuals")+
    theme_minimal()


g3 <- ggplot(norway,aes(x= pH))+geom_point(aes(y = pCO2_pred, col = "pCO2 from model"))+
  geom_point(aes(y = fCO2, col = "Measured pCO2"))+
  labs(y = "pCO2, atm", x = "pH", col = "")+
  scale_color_manual(values = c( "firebrick","dodgerblue3"))+
  facet_grid(cols = vars(survey))+
  theme_minimal()+
  theme(legend.position = "bottom")

plot_grid(ggarrange(g1,g2),g3,nrow = 2,rel_heights = c(2,3))

Figure 53: Predicted vs fitted values, Standardized residuals vs fitted values,and measured (red) calculated (blue) pCO2 vs pH for pCO2 ~ TOC / Hplus + TA

nlss$pCO2_pred <- predict(pco2.tocinvHplus.ta, newdata = nlss, type = "response")

ggplot(data = nlss, aes(x = pH))+
  geom_point(aes(y = pCO2_pred, col = "Predicted pCO2, NLS"), alpha = 0.5)+
  geom_hline(yintercept = fCO2.atm.1995, lty = 2)+
  labs(x = "pH", col = "", y = "pCO2, atm")+
  scale_color_manual(values = c("dodgerblue"))+
  theme_bw(base_size = 15)+
  theme(legend.position = "top")

Figure 36: Predicted pCO2 in atm vs pH for the Northern Lake Survey

3.15 Summary of all models

Table 1: Summary of DIC and pCO2 models

kable(cbind(data.frame(Predictors = c("TA",
                                      "1/TA",
                                "TOC",
                                "TA + TOC",
                                "TA*TOC",
                                "TA*Hplus",
                                "TOC*Hplus",
                                "TA*Hplus + TOC", 
                                "TA + TOC*Hplus",
                                "TA/Hplus + TOC",
                                "TA + TOC/Hplus"),
                        R2 = c(pco2.ta.r2,
                              pco2.invTA.r2,
                              pco2.toc.r2,
                              pco2.toc.ta.r2,
                              pco2.tocta.r2,
                              pco2.taHplus.r2,
                              pco2.TOCHplus.r2,
                              pco2.toc.taHplus.r2,
                              pco2.tocHplus.ta.r2,
                              pco2.toc.tainvHplus.r2,
                              pco2.tocinvHplus.ta.r2),
                       AIC = c(pco2.ta.aic,
                              pco2.invTA.aic,
                              pco2.toc.aic,
                              pco2.toc.ta.aic,
                              pco2.tocta.aic,
                              pco2.taHplus.aic,
                              pco2.TOCHplus.aic,
                              pco2.toc.taHplus.aic,
                              pco2.tocHplus.ta.aic,
                              pco2.toc.tainvHplus.aic,
                              pco2.tocinvHplus.ta.aic),
                        significant  = c(pco2.ta.sp,
                                 pco2.invTA.sp,
                                 pco2.toc.sp,
                                 pco2.toc.ta.sp,
                                 pco2.tocta.sp,
                                 pco2.taHplus.sp,
                                 pco2.TOCHplus.sp,
                                 pco2.toc.taHplus.sp,
                                 pco2.tocHplus.ta.sp,
                                 pco2.toc.tainvHplus.sp,
                                 pco2.tocinvHplus.ta.sp)),
                data.frame(R2 = c(dic.ta.r2,
                                dic.invTA.r2,
                                dic.toc.r2,
                                dic.toc.ta.r2,
                                dic.tocta.r2,
                                dic.taHplus.r2,
                                dic.TOCHplus.r2,
                                dic.toc.taHplus.r2,
                                dic.tocHplus.ta.r2,
                                dic.toc.tainvHplus.r2,
                                dic.tocinvHplus.ta.r2),
                           AIC = c(dic.ta.aic,
                                dic.invTA.aic,
                                dic.toc.aic,
                                dic.toc.ta.aic,
                                dic.tocta.aic,
                                dic.taHplus.aic,
                                dic.TOCHplus.aic,
                                dic.toc.taHplus.aic,
                                dic.tocHplus.ta.aic,
                                dic.toc.tainvHplus.aic,
                                dic.tocinvHplus.ta.aic),
                      significant  = c(dic.ta.sp,
                                dic.invTA.sp,
                                dic.toc.sp,
                                dic.toc.ta.sp,
                                dic.tocta.sp,
                                dic.taHplus.sp,
                                dic.TOCHplus.sp,
                                dic.toc.taHplus.sp,
                                dic.tocHplus.ta.sp,
                                dic.toc.tainvHplus.sp,
                                dic.tocinvHplus.ta.sp))),
             digits = 2) %>% 
  add_header_above(c(" "= 1, "pCO2" = 3, "DIC" = 3),italic = T) %>%
  column_spec(column = c(2,4), bold = T) %>% 
  kable_styling(bootstrap_options = "bordered")
pCO2
DIC
Predictors R2 AIC significant R2 AIC significant
TA 0.71 -2039.91 TA 0.82 -2915.00 (Intercept), TA
1/TA 0.17 -1829.24 invTA 0.05 -2600.20 (Intercept), invTA
TOC 0.88 -2192.88 TOC 0.45 -2709.82 (Intercept), TOC
TA + TOC 0.88 -2197.09 TOC, TA 0.84 -2933.13 (Intercept), TOC, TA
TA*TOC 0.88 -2197.09 TOC, TA 0.84 -2931.13
TA*Hplus 0.90 -2236.16 TA, Hplus, TA:Hplus 0.86 -2952.38 (Intercept), TA, Hplus
TOC*Hplus 0.88 -2193.20 TOC 0.50 -2724.46 (Intercept), TOC
TA*Hplus + TOC 0.90 -2234.25 TA, Hplus, TA:Hplus 0.86 -2950.38 (Intercept), TA, Hplus
TA + TOC*Hplus 0.89 -2208.10 TOC, TA 0.86 -2955.67 (Intercept), TA, TOC:Hplus
TA/Hplus + TOC 0.88 -2195.90 TOC, TA, invHplus 0.89 -2992.33 (Intercept), TOC, TA, invHplus, TA:invHplus
TA + TOC/Hplus 0.88 -2197.22 TOC, TA 0.85 -2947.14 (Intercept), TOC, TA, TOC:invHplus

4 Northern European Lakes Survey

g1 <- ggplot()+geom_boxplot(data = nlss, aes(x="TA\nNorthern European Lakes Survey, 1995", y=TA, col = "NLS_1995"))+
  geom_boxplot(data = norway, aes(x = "TA\n CBA (2019) and N112 (2004) surveys", y = TA, col = survey))+
  scale_color_manual(values = c("firebrick","orange","dodgerblue"))+
  labs(x = "", col = "",y = "Total alkalinity, eq/L")+
  theme_minimal(base_size = 15)+
  theme(legend.position = "bottom")

g2 <- ggplot()+geom_boxplot(data = nlss, aes(x="TOC\nNorthern European Lakes Survey, 1995", y=TOC, col = "NLS_1995"))+
  geom_boxplot(data = norway, aes(x = "TOC\n CBA (2019) and N112 (2004) surveys", y = TOC, col = survey))+
  scale_color_manual(values = c("firebrick","orange","dodgerblue"))+
  labs(x = "", col = "",y = "Total organic carbon, mol/L")+
  theme_minimal(base_size = 15)+
  theme(legend.position = "bottom")

g3 <- ggplot()+geom_boxplot(data = nlss, aes(x="pH\nNorthern European Lakes Survey, 1995", y=pH, col = "NLS_1995"))+
  geom_boxplot(data = norway, aes(x = "pH\n CBA (2019) and N112 (2004) surveys", y = pH, col = survey))+
  scale_color_manual(values = c("firebrick","orange","dodgerblue"))+
  labs(x = "", col = "",y = "pH")+
  theme_minimal(base_size = 15)+
  theme(legend.position = "bottom")

ggarrange(g1,g2,g3, common.legend = T, nrow = 3)

Figure 54: Comparison of the distribution of TA, TOC and pH in the CBA, N112 and NLS surveys

nlss$ta_CO3 <- with(nlss,(TA - OH + Hplus) / (Hplus/K2 +2)) # in mol/L
nlss$ta_HCO3 <- with(nlss, ta_CO3*Hplus/K2)
nlss$ta_H2CO3 <- with(nlss, ta_HCO3*Hplus/K1)
nlss$pCO2_ta <- with(nlss,ta_H2CO3/K0)

g1 <- ggplot(data = nlss, aes(x = pH))+
  geom_point(aes(y = pCO2_ta), alpha = 0.5, col = "firebrick")+
  labs(x = "pH", y = "pCO2 from TA, atm")+
  geom_hline(yintercept = fCO2.atm.1995, lty = 2)+
  facet_zoom(y = pCO2_ta > 0, zoom.size = 1)+
  theme_bw(base_size = 15)

print(g1)

Figure 55: pCO2 calculated from TA in the NLS surveys

nlss$pCO2_toc <- predict(pco2.toc, newdata = nlss, type = "response")

g1 <- ggplot(data = nlss, aes(x = pH))+
  geom_point(aes(y = pCO2_ta, col = "pCO2 from carbonate equilibrium"), alpha = 0.5)+
  geom_point(aes(y = pCO2_toc, col = "pCO2 modelled with TOC"), alpha = 0.5)+
  geom_hline(yintercept = fCO2.atm.1995, lty = 2)+
  labs(x = "pH", col = "", y = "")+
  scale_color_manual(values = c("firebrick","dodgerblue"))+
  theme_bw(base_size = 15)+
  theme(legend.position = "top")+
  facet_zoom(y = pCO2_ta > 0, zoom.size = 2)


g2 <- ggplot(data = nlss, aes(x = pH))+
  geom_point(aes(y = pCO2_ta, col = "pCO2 from carbonate equilibrium"), alpha = 0.5)+
  geom_point(aes(y = pCO2_toc, col = "pCO2 modelled with TOC"), alpha = 0.5)+
  geom_hline(yintercept = fCO2.atm.1995, lty = 2)+
  labs(x = "pH", col = "", y = "pCO2, atm")+
  scale_color_manual(values = c("firebrick","dodgerblue"))+
  ylim(0,0.006)+
  theme_bw(base_size = 15)+
  theme(legend.position = "top")

ggarrange(g2,g1, common.legend = T)

Figure 56: pCO2 calculated from TA, compared to pCO2 predicted with TOC, for the NLS surveys

ggplot(nlss)+geom_point(aes(x=long, y = lat, col = pCO2_toc*1e6))+
  borders(regions = c("Norway","Sweden","Finland"))+ylim(55,72)+xlim(4,32)+
  scale_fill_continuous_divergingx(palette = "RdYlBu", aesthetics = c("fill","col"), mid = 1000, rev = T)+
  labs(col="pCO2, uatm")+
  theme_void()

# Evasion rate of CO2

4.1 Theory

Wind speed https://wcrp-cmip.github.io/CMIP6_CVs/docs/CMIP6_institution_id.html

Data downloaded from Copernicus https://cds.climate.copernicus.eu/cdsapp#!/dataset/derived-near-surface-meteorological-variables?tab=overview

See part 1

Theory from Hastie et al. (2018)

\[FCO_2 = A_{lake} \times k \times \Delta CO_2\]

\(FCO_2\) in mol/day, \(A_{lake}\) = lake area in m2, \(\Delta CO_2\) in mol/m3 is the difference between water and air pCO2.

According to Vachon and Prairie (2013):

\[k_{600} = 2.51 + 1.48 \times U_{10} + 0.39 \times U_{10} \times log_{10} (A_{lake})\]

with \(k_600\) in \(cm/h\), \(A\) in km2 and \(U_{10}\) in m/s

Another widely used equation for k600 was introduced by Cole and Caraco (1998) and only uses the wind speed:

\[ k_{600} = 2.07 + 0.215 \times U_{10}^{1.7}\]

\(k_{CO_2}\) has to be adjusted for temperature (Crusius and Wanninkhof 2003) via the Schmidt number:

\[k_{600} = k_{CO_2}\times \left( \frac{600}{Sc_{CO_2}} \right) ^{-n}\]

The Schmidt \(Sc_{CO_2}\) number is calculated from Wanninkhof (2014):

\[S_c = 1923.6 - 125.06 \times T + 4.3773 \times T^2 - 0.085681 \times T^3\]

4.2 Compute ECO2

We calculate the gas transfer coefficient kCO2, based on the normalised gas transfer coefficient k600 and the Schmidt number.

norway$pCO2_pred <- predict(pco2.toc, newdata = norway)

norway$Sc <- with(norway, 1923.6-125.06*temp_c+4.3733*temp_c^2-0.086781*temp_c^3) #@wanningkhof

norway$n <- NA
for(i in 1:length(norway$n)){
  if(norway$nws_m.s[i] < 3.7 & is.na(norway$nws_m.s[i]) == F){
    norway$n[i] = 2/3
  }else if(norway$nws_m.s[i] > 3.7 & is.na(norway$nws_m.s[i]) == F){
    norway$n[i] = 1/2
  } else if(is.na(norway$nws_m.s[i]) == T){
    norway$n[i] = NA
  }
} # adjusted n from Vachon


norway$k600 <- with(norway, 2.51+1.48*nws_m.s+0.39*nws_m.s*log10(lake.area)) # lake.area already in km2, in cm/h
                    
norway$kCO2 <- with(norway, k600*(600/Sc)^(n))*24*1e-2 # from cm/h to m/day

norway$k600_low <- with(norway, 2.07+0.25*nws_m.s^1.7) # lake area in km2. Cole & Caraco 1998
                    
norway$kCO2_low <- with(norway, k600_low*(600/Sc)^(n))*24*1e-2

norway$ECO2 <- with(norway, kCO2*(fCO2-fCO2.atm)*K0*1e3 * 12.011) # in gC/m2/day

norway$ECO2_low <- with(norway, kCO2_low*(fCO2-fCO2.atm)*K0*1e3 * 12.011) # in gC/m2/day

norway$ECO2_toc <- with(norway, kCO2*(pCO2_pred-fCO2.atm)*K0*1e3 * 12.011) # in gC/m2/day

norway$ECO2_toc_low <- with(norway, kCO2_low*(pCO2_pred-fCO2.atm)*K0*1e3 * 12.011) # in gC/m2/day


norway$ECO2_ta <- with(norway, kCO2*(ta_fCO2-fCO2.atm)*K0*1e3 * 12.011) # in gC/m2/day

norway$ECO2_ta_low <- with(norway, kCO2_low*(ta_fCO2-fCO2.atm)*K0*1e3 * 12.011) # in gC/m2/day
nlss$Sc <- with(nlss, 1923.6-125.06*temp_c+4.3733*temp_c^2-0.086781*temp_c^3) #@wanningkhof

nlss$n <- NA
for(i in 1:length(nlss$n)){
  if(nlss$nws_m.s[i] < 3.7 & is.na(nlss$nws_m.s[i]) == F){
    nlss$n[i] = 2/3
  }else if(nlss$nws_m.s[i] > 3.7 & is.na(nlss$nws_m.s[i]) == F){
    nlss$n[i] = 1/2
  } else if(is.na(nlss$nws_m.s[i]) == T){
    nlss$n[i] = NA
  }
}


nlss$k600 <- with(nlss, 2.51+1.48*nws_m.s+0.39*nws_m.s*log10(lake.area)) # lake area in km2
                    
nlss$kCO2 <- with(nlss, k600*(600/Sc)^(n))*24*1e-2

nlss$k600_low <- with(nlss, 2.07+0.25*nws_m.s^1.7) # lake area in km2. Cole & Caraco 1998
nlss$kCO2_low <- with(nlss, k600_low*(600/Sc)^(n))*24*1e-2

nlss$ECO2_toc <- with(nlss, kCO2*(pCO2_toc-fCO2.atm)*K0*1e3 * 12.011) # in gC/m2/day
nlss$ECO2_toc_low <- with(nlss, kCO2_low*(pCO2_toc-fCO2.atm)*K0*1e3 * 12.011)

nlss$ECO2_ta <- with(nlss, kCO2*(pCO2_ta-fCO2.atm)*K0*1e3 * 12.011) # in gC/m2/day
nlss$ECO2_ta_low <- with(nlss, kCO2_low*(pCO2_ta-fCO2.atm)*K0*1e3 * 12.011) # in gC/m2/day

4.3 Compare ECO2 from TOC and ECO2 from TA

Comparison of the ECO2 obtained by calculation pCO2 from TOC or TA (in that case, we remove samples in which pH < 5.4).

Table 2: Median of pCO2 and ECO2, pCO2 predicted from TOC

no <- norway %>% 
  dplyr::select(c("survey", "fCO2" ,"pCO2_pred", "ECO2_toc", "ECO2_toc_low")) %>% 
  mutate(nation = "Norway") %>% 
  setNames(c("survey","measured_pCO2","pCO2_toc","ECO2_high","ECO2_low","nation"))  %>% 
  dplyr::select(c("survey","nation","measured_pCO2","pCO2_toc","ECO2_high","ECO2_low"))

fns <- nlss %>% as_data_frame() %>%
  dplyr::select(c("nation", "pCO2_toc", "ECO2_toc", "ECO2_toc_low")) %>% 
  mutate(fCO2 = NA, survey = "nls_1995") %>% 
  setNames(c("nation","pCO2_toc","ECO2_high", "ECO2_low","measured_pCO2","survey")) %>% 
  dplyr::select(c("survey","nation","measured_pCO2","pCO2_toc","ECO2_high","ECO2_low"))

summary_eco2 <- rbind(no,fns) %>% group_by(nation, survey) %>% summarize_all(median, na.rm = T) 
kable(summary_eco2) %>% kable_styling()
nation survey measured_pCO2 pCO2_toc ECO2_high ECO2_low
Finland nls_1995 0.0010432 0.6452481 0.3917249
Norway CBA_2019 0.0011346 0.0015690 0.8252406 0.4543466
Norway N112_2004 0.0006845 0.0006680 0.2287118 0.1213060
Norway nls_1995 0.0005835 0.1127527 0.0659726
Sweden nls_1995 0.0009819 0.5745514 0.3606773

Table 3: Median of pCO2 and ECO2, pCO2 calculated from TA (excluding samples with pH < 5.4)

no <- norway %>% 
  dplyr::select(c("survey", "pH", "fCO2", "ta_fCO2", "ECO2_ta", "ECO2_ta_low")) %>% 
  mutate(nation = "Norway") %>% 
  setNames(c("survey","pH","measured_pCO2","pCO2_ta","ECO2_high","ECO2_low","nation"))  %>%
  dplyr::select(c("survey","nation","pH","measured_pCO2","pCO2_ta","ECO2_high","ECO2_low"))

fns <- nlss %>% as_data_frame() %>%
  dplyr::select(c("nation", "pH","pCO2_ta","ECO2_ta", "ECO2_ta_low")) %>% 
  mutate(fCO2 = NA, survey = "nls_1995") %>% 
  setNames(c("nation","pH","pCO2_ta","ECO2_high", "ECO2_low","measured_pCO2","survey")) %>% 
  dplyr::select(c("survey","nation","pH","measured_pCO2","pCO2_ta","ECO2_high","ECO2_low"))

summary_eco2_ta <- rbind(no,fns) %>% filter(pH > 5.4) %>% group_by(nation, survey) %>% summarize_all(median, na.rm = T) 
kable(summary_eco2_ta) %>% kable_styling()
nation survey pH measured_pCO2 pCO2_ta ECO2_high ECO2_low
Finland nls_1995 6.70 0.0013406 0.8801672 0.5303916
Norway CBA_2019 6.87 0.0011346 0.0015188 0.6997074 0.3625881
Norway N112_2004 6.07 0.0006381 0.0009285 0.3790531 0.1897891
Norway nls_1995 6.54 0.0007590 0.1992923 0.1199373
Sweden nls_1995 6.85 0.0011476 0.7451497 0.4701856

4.4 Summary table with median and mean, TOC

Table 4: Median and mean of pCO2 and ECO2, pCO2 predicted from TOC, by survey

no <- norway %>% 
  dplyr::select(c("survey", "fCO2" , "ECO2", "ECO2_toc","ECO2_low", "ECO2_toc_low")) 

medyear <- function(x){median(x, na.rm = T) * 365} # to convert gC/m2/d in gC/m2/year
avyear <- function(x){mean(x, na.rm = T) * 365}

summary_eco2.1 <- no %>% group_by(survey) %>% summarize_all(medyear) %>% mutate(fCO2 = fCO2/365*1e6)

summary_eco2.2 <- no %>% group_by(survey) %>% summarize_all(avyear) %>% mutate(fCO2 = fCO2/365*1e6)

no2 <- rbind(summary_eco2.1, summary_eco2.2) %>% arrange(survey) %>% mutate(survey = c("median","mean","median","mean"))

fen <- nlss %>% dplyr::select(c("pCO2_toc","ECO2_toc","ECO2_toc_low")) %>% 
  mutate(ECO2 = NA, ECO2_low = NA, fCO2 = pCO2_toc/365*1e6, survey = NA) %>%
  dplyr::select(c("survey","fCO2", "ECO2", "ECO2_toc", "ECO2_low","ECO2_toc_low"))
fen$lake.id <- NULL

summary_fen.1 <- fen %>% group_by(survey) %>% summarize_all(medyear) %>% mutate(survey= "median")
summary_fen.2 <- fen %>% group_by(survey) %>% summarize_all(avyear) %>% mutate(survey= "mean")

tot <- rbind(no2, summary_fen.1) %>% rbind(summary_fen.2)

kable(tot, digits = 2, col.names = c(" ","pCO2, uatm", "ECO2 from DIC in gC/m2/day","ECO2 from TOC in gC/m2/day","ECO2 from DIC in gC/m2/day","ECO2 from TOC in gC/m2/day")) %>% 
  kable_styling %>% 
  group_rows(group_label= "CBA", start_row = 1, end_row = 2) %>%
  group_rows(group_label= "N112", start_row = 3, end_row = 4) %>% 
  group_rows(group_label= "NLS", start_row = 5, end_row = 6) %>% 
  add_header_above(c(" " = 2, "High estimates" = 2, "Low estimates" = 2)) 
High estimates
Low estimates
pCO2, uatm ECO2 from DIC in gC/m2/day ECO2 from TOC in gC/m2/day ECO2 from DIC in gC/m2/day ECO2 from TOC in gC/m2/day
CBA
median 1134.57 183.74 301.21 100.56 165.84
mean 1517.26 264.19 337.36 157.73 184.89
N112
median 684.50 80.27 83.48 46.72 44.28
mean 779.18 109.49 95.35 60.31 51.87
NLS
median 929.36 175.38 106.94
mean 998.84 209.77 139.10
kable(tot, digits = 2, col.names = c(" ","pCO2, uatm", "ECO2 from DIC in gC/m2/year","ECO2 from TOC in gC/m2/year","ECO2 from DIC in gC/m2/year","ECO2 from TOC in gC/m2/year")) %>% 
  kable_styling() %>% 
  group_rows(group_label= "CBA", start_row = 1, end_row = 2) %>%
  group_rows(group_label= "N112", start_row = 3, end_row = 4) %>%
  group_rows(group_label= "NLS", start_row = 5, end_row = 6) %>% 
  add_header_above(c(" " = 2, "High estimates" = 2, "Low estimates" = 2)) %>%
  save_kable(file = "compare.eco2.measured.png",zoom = 5)

4.5 Compare total emissions calculated from bins, with total emissions calculated from median and mean

We also compare with the efflux calculated by Hastie et al. (2018) by class of area. The efflux is given in TgC/year. We convert it to gC/m2/year for each country.

Table 5: Total efflux of CO2 computed from the median for each lake category, compared to efflux calculated for all lakes (Norway, Sweden, Finland)

options(knitr.kable.NA = '')

p0 <- median(nlss$pCO2_toc, na.rm = T)*1e6
A0 <- sum(nlss$lake.area)
k0 <- nlss$kCO2 %>% median(na.rm = T) 
k0_low <- nlss$kCO2_low %>% median(na.rm = T)
E0 <- median(nlss$ECO2_toc * 365, na.rm = T) # gC/m2/y
E0_low <- median(nlss$ECO2_toc_low * 365, na.rm = T) # gC/m2/y

F0_realsum <-  with(nlss, sum(ECO2_toc * 365 *lake.area*1e6/12.011 * 43.99 * 1e-12, na.rm = T)) #MtCO2/y
F0_sumfrommedian <- with(nlss, E0*sum(lake.area*1e6)/12.011 * 43.99 * 1e-12)

F0_low_realsum <-  with(nlss, sum(ECO2_toc_low * 365 *lake.area*1e6/12.011 * 43.99 * 1e-12, na.rm = T)) #MtCO2/y
F0_low_sumfrommedian <- with(nlss, E0_low*sum(lake.area*1e6)/12.011 * 43.99 * 1e-12)

# Lake size < 0.1 km2

p1 <- nlss %>% subset(lake.area < 0.1) %>% pull(pCO2_toc) %>% median()*1e6
A1 <- nlss %>% subset(lake.area < 0.1) %>% pull(lake.area) %>% sum()

k1 <- with(subset(nlss, lake.area < 0.1), median(kCO2, na.rm = T)) 
E1 <- with(subset(nlss, lake.area < 0.1), median(ECO2_toc*365, na.rm = T)) #gC/m2/y

F1_realsum <- with(subset(nlss, lake.area < 0.1), sum(ECO2_toc * 365 *lake.area*1e6/12.011 * 43.99 * 1e-12, na.rm = T)) #MtCO2/y
F1_sumfrombin <- with(subset(nlss, lake.area < 0.1), E1*sum(lake.area*1e6)/12.011 * 43.99 * 1e-12)

  
k1_low <- with(subset(nlss, lake.area < 0.1), median(kCO2_low, na.rm = T)) 
E1_low <- with(subset(nlss, lake.area < 0.1), median(ECO2_toc_low*365, na.rm = T)) #gC/m2/y

F1_low_realsum <- with(subset(nlss, lake.area < 0.1), sum(ECO2_toc_low * 365/12.011 * 43.99  * 1e-12*lake.area*1e6, na.rm = T)) #MtCO2/y
F1_low_sumfrombin <- with(subset(nlss, lake.area < 0.1), E1_low*sum(lake.area*1e6)/12.011 * 43.99 * 1e-12)


W1 <- 38.17*1e12/(208008*1e6) # gC/m2/year in Weyhenmeyer
W1_sum <- with(subset(nlss, lake.area < 0.1), W1*sum(lake.area*1e6)/12.011 * 43.99 * 1e-12) # MtCO2/y

# 0.1 < Lake size < 1 km2

p2 <- nlss %>% subset(lake.area < 1 & lake.area > 0.1) %>% pull(pCO2_toc) %>% median()*1e6
A2 <- nlss %>% subset(lake.area < 1 & lake.area > 0.1) %>% pull(lake.area) %>% sum()

k2 <- with(subset(nlss, lake.area < 1 & lake.area > 0.1), median(kCO2, na.rm = T)) 
E2 <- with(subset(nlss, lake.area < 1 & lake.area > 0.1), median(ECO2_toc*365, na.rm = T)) #gC/m2/y

F2_realsum <- with(subset(nlss, lake.area < 1 & lake.area > 0.1), sum(ECO2_toc * 365/12.011 * 43.99 * 1e-12*lake.area*1e6, na.rm = T)) #MtCO2/y
F2_sumfrombin <- with(subset(nlss, lake.area < 1 & lake.area > 0.1), E2*sum(lake.area*1e6)/12.011 * 43.99 * 1e-12)


k2_low <- with(subset(nlss, lake.area < 1 & lake.area > 0.1), median(kCO2_low, na.rm = T)) 
E2_low <- with(subset(nlss, lake.area < 1 & lake.area > 0.1), median(ECO2_toc_low*365, na.rm = T)) #gC/m2/y

F2_low_realsum <- with(subset(nlss, lake.area < 1 & lake.area > 0.1), sum(ECO2_toc_low * 365/12.011 * 43.99 * 1e-12*lake.area*1e6, na.rm = T)) #MtCO2/y
F2_low_sumfrombin <- with(subset(nlss, lake.area < 1 & lake.area > 0.1), E2_low*sum(lake.area*1e6)/12.011 * 43.99 * 1e-12)



W2 <- 52.36*1e12/(282434*1e6) # gC/m2/year in W
W2_sum <- with(subset(nlss, lake.area < 1 & lake.area > 0.1), W2*sum(lake.area*1e6)/12.011 * 43.99 * 1e-12) # MtCO2/y

# 1 < Lake size < 10 km2

p3 <- nlss %>% subset(lake.area > 1 & lake.area < 10) %>% pull(pCO2_toc) %>% median()*1e6
A3 <- nlss %>% subset(lake.area > 1 & lake.area < 10) %>% pull(lake.area) %>% sum()

k3 <- with(subset(nlss, lake.area > 1 & lake.area < 10), median(kCO2, na.rm = T)) 
E3 <- with(subset(nlss, lake.area > 1 & lake.area < 10), median(ECO2_toc*365, na.rm = T)) #gC/m2/y

F3_realsum <- with(subset(nlss, lake.area > 1 & lake.area < 10), sum(ECO2_toc * 365/12.011 * 43.99 * 1e-12*lake.area*1e6, na.rm = T)) #MtCO2/y
F3_sumfrombin <- with(subset(nlss, lake.area > 1 & lake.area < 10), E3*sum(lake.area*1e6)/12.011 * 43.99 * 1e-12)


k3_low <- with(subset(nlss, lake.area > 1 & lake.area < 10), median(kCO2_low, na.rm = T)) 
E3_low <- with(subset(nlss, lake.area > 1 & lake.area < 10), median(ECO2_toc_low*365, na.rm = T)) #gC/m2/y

F3_low_realsum <- with(subset(nlss, lake.area > 1 & lake.area < 10), sum(ECO2_toc_low * 365/12.011 * 43.99  * 1e-12*lake.area*1e6, na.rm = T)) #MtCO2/y
F3_low_sumfrombin <- with(subset(nlss, lake.area > 1 & lake.area < 10), E3_low*sum(lake.area*1e6)/12.011 * 43.99  * 1e-12)


W3 <- 22.80*1e12/(286624*1e6) # gC/m2/year in W
W3_sum <- with(subset(nlss, lake.area > 1 & lake.area < 10), W3*sum(lake.area*1e6)/12.011 * 43.99 * 1e-12) # MtCO2/y

# 1 < Lake size < 10 km2

p4 <- nlss %>% subset(lake.area > 10) %>% pull(pCO2_toc) %>% median()*1e6
A4 <- nlss %>% subset(lake.area > 10) %>% pull(lake.area) %>% sum()

k600_4 <- with(subset(nlss, lake.area > 10), median(k600*24*1e-2, na.rm = T))
k600_4_low <- with(subset(nlss, lake.area > 10), median(k600_low*24*1e-2, na.rm = T))

k4 <- with(subset(nlss, lake.area > 10), median(kCO2, na.rm = T)) 
E4 <- with(subset(nlss, lake.area > 10), median(ECO2_toc*365, na.rm = T)) #gC/m2/y

F4_realsum <- with(subset(nlss, lake.area > 10), sum(ECO2_toc* 365*lake.area*1e6/12.011 * 43.99  * 1e-12, na.rm = T)) #MtCO2/y
F4_sumfrombin <- with(subset(nlss, lake.area > 10), E4*sum(lake.area*1e6)/12.011 * 43.99 * 1e-12)



k4_low <- with(subset(nlss, lake.area > 10), median(kCO2_low, na.rm = T)) 
E4_low <- with(subset(nlss, lake.area > 10), median(ECO2_toc_low*365, na.rm = T)) #gC/m2/y

F4_low_realsum <- with(subset(nlss, lake.area > 10), sum(ECO2_toc_low* 365*lake.area*1e6/12.011 * 43.99  * 1e-12, na.rm = T)) #MtCO2/y
F4_low_sumfrombin <- with(subset(nlss, lake.area > 10), E4_low*sum(lake.area*1e6)/12.011 * 43.99  * 1e-12)


W4 <- 53.96*1e12/(570583*1e6) # gC/m2/year in W
W4_sum <- with(subset(nlss, lake.area > 10), W4*sum(lake.area*1e6)/12.011 * 43.99 * 1e-12) # MtCO2/y

bin_FCO2 <- data.frame(lake.size = c("A < 0.1","0.1 < A < 1","1 < A < 10","> 10","sum"," "), 
          lake.area = c(A1, A2, A3, A4, NA,A0),
           pCO2 = c(p1,p2,p3,p4,NA,p0), 
           kCO2_low = c(k1_low,k2_low,k3_low,k4_low,NA,k0_low), 
           E_low = c(E1_low,E2_low,E3_low,E4_low,NA,E0_low),
           FCO2_low_bin = c(F1_low_sumfrombin, F2_low_sumfrombin, F3_low_sumfrombin, F4_low_sumfrombin, sum(F1_low_sumfrombin, F2_low_sumfrombin, F3_low_sumfrombin, F4_low_sumfrombin),F0_low_sumfrommedian),
            FCO2_low = c(F1_low_realsum, F2_low_realsum, F3_low_realsum, F4_low_realsum,sum(F1_low_realsum,F2_low_realsum,F3_low_realsum,F4_low_realsum),F0_low_realsum),
           kCO2_high = c(k1,k2,k3,k4, NA,k0),
           E_high = c(E1,E2,E3,E4, NA,E0),
           FCO2_high_bin = c(F1_sumfrombin, F2_sumfrombin, F3_sumfrombin, F4_sumfrombin, sum(F1_sumfrombin, F2_sumfrombin, F3_sumfrombin, F4_sumfrombin),F0_sumfrommedian),
          FCO2_high = c(F1_realsum, F2_realsum, F3_realsum, F4_realsum, sum(F1_realsum, F2_realsum, F3_realsum, F4_realsum),F0_realsum),
           Wey = c(W1,W2,W3,W4, NA,NA),
           Wey_sum = c(W1_sum, W2_sum, W3_sum, W4_sum, sum(W1_sum, W2_sum, W3_sum, W4_sum),NA))


kable(bin_FCO2, digits = 2, col.names = c("A = lake area in km2","Total lake area in km2","median pCO2, uatm","median kCO2, m2/d","median ECO2, g/m2/y", "binned FCO2, MtCO2/y","actual FCO2, MtCO2/y","kCO2, m2/d","median ECO2, g/m2/y", "binned FCO2, MtCO2/y", "actual FCO2, MtCO2/y","median ECO2, g/m2/day", "FCO2, MtCO2/y")) %>% kable_styling() %>% 
  add_header_above(c("Lake class" = 2, " " = 1, "Low estimate" = 4, "High estimate" = 4, "Hastie et al." = 2)) %>%
  group_rows(group_label= "All lakes", start_row = 6, end_row = 6) %>%
  column_spec(column = c(5,6,9,10,12), bold = T)
Lake class
Low estimate
High estimate
Hastie et al.
A = lake area in km2 Total lake area in km2 median pCO2, uatm median kCO2, m2/d median ECO2, g/m2/y binned FCO2, MtCO2/y actual FCO2, MtCO2/y kCO2, m2/d median ECO2, g/m2/y binned FCO2, MtCO2/y actual FCO2, MtCO2/y median ECO2, g/m2/day FCO2, MtCO2/y
A < 0.1 89.23 946.87 0.55 91.89 0.03 0.05 0.78 131.26 0.04 0.06 183.50 0.06
0.1 < A < 1 449.17 955.63 0.73 120.00 0.20 0.23 1.14 188.26 0.31 0.37 185.39 0.30
1 < A < 10 2386.51 911.84 0.78 113.96 1.00 1.13 1.44 213.85 1.87 2.10 79.55 0.70
> 10 32393.63 872.44 0.66 98.68 11.71 14.08 1.57 234.54 27.83 36.60 94.57 11.22
sum 12.93 15.49 30.05 39.14 12.28
All lakes
35318.55 929.36 0.64 106.94 13.83 15.49 1.03 175.38 22.69 39.14

Table 6: Total efflux of CO2 computed from the mean for each lake category, compared to efflux calculated for all lakes (Norway, Sweden, Finland)

p0 <- mean(nlss$pCO2_toc, na.rm = T)*1e6
A0 <- sum(nlss$ake.area)
k0 <- nlss$kCO2 %>% mean(na.rm = T) 
k0_low <- nlss$kCO2_low %>% mean(na.rm = T)
E0 <- mean(nlss$ECO2_toc * 365, na.rm = T) # gC/m2/y
E0_low <- mean(nlss$ECO2_toc_low * 365, na.rm = T) # gC/m2/y

F0_realsum <-  with(nlss, sum(ECO2_toc * 365 *lake.area*1e6/12.011 * 43.99 * 1e-12, na.rm = T)) #MtCO2/y
F0_sumfrommean <- with(nlss, E0*sum(lake.area*1e6)/12.011 * 43.99 * 1e-12)

F0_low_realsum <-  with(nlss, sum(ECO2_toc_low * 365 *lake.area*1e6/12.011 * 43.99 * 1e-12, na.rm = T)) #MtCO2/y
F0_low_sumfrommean <- with(nlss, E0_low*sum(lake.area*1e6)/12.011 * 43.99 * 1e-12)

# Lake size < 0.1 km2

p1 <- nlss %>% subset(lake.area < 0.1) %>% pull(pCO2_toc) %>% mean()*1e6
A1 <- nlss %>% subset(lake.area < 0.1) %>% pull(lake.area) %>% sum()

k1 <- with(subset(nlss, lake.area < 0.1), mean(kCO2, na.rm = T)) 
E1 <- with(subset(nlss, lake.area < 0.1), mean(ECO2_toc*365, na.rm = T)) #gC/m2/y

F1_realsum <- with(subset(nlss, lake.area < 0.1), sum(ECO2_toc * 365 *lake.area*1e6/12.011 * 43.99 * 1e-12, na.rm = T)) #MtCO2/y
F1_sumfrombin <- with(subset(nlss, lake.area < 0.1), E1*sum(lake.area*1e6)/12.011 * 43.99 * 1e-12)

  
k1_low <- with(subset(nlss, lake.area < 0.1), mean(kCO2_low, na.rm = T)) 
E1_low <- with(subset(nlss, lake.area < 0.1), mean(ECO2_toc_low*365, na.rm = T)) #gC/m2/y

F1_low_realsum <- with(subset(nlss, lake.area < 0.1), sum(ECO2_toc_low * 365/12.011 * 43.99  * 1e-12*lake.area*1e6, na.rm = T)) #MtCO2/y
F1_low_sumfrombin <- with(subset(nlss, lake.area < 0.1), E1_low*sum(lake.area*1e6)/12.011 * 43.99 * 1e-12)


W1 <- 38.17*1e12/(208008*1e6) # gC/m2/year in Weyhenmeyer
W1_sum <- with(subset(nlss, lake.area < 0.1), W1*sum(lake.area*1e6)/12.011 * 43.99 * 1e-12) # MtCO2/y

# 0.1 < Lake size < 1 km2

p2 <- nlss %>% subset(lake.area < 1 & lake.area > 0.1) %>% pull(pCO2_toc) %>% mean()*1e6

k2 <- with(subset(nlss, lake.area < 1 & lake.area > 0.1), mean(kCO2, na.rm = T)) 
E2 <- with(subset(nlss, lake.area < 1 & lake.area > 0.1), mean(ECO2_toc*365, na.rm = T)) #gC/m2/y

F2_realsum <- with(subset(nlss, lake.area < 1 & lake.area > 0.1), sum(ECO2_toc * 365/12.011 * 43.99 * 1e-12*lake.area*1e6, na.rm = T)) #MtCO2/y
F2_sumfrombin <- with(subset(nlss, lake.area < 1 & lake.area > 0.1), E2*sum(lake.area*1e6)/12.011 * 43.99 * 1e-12)


k2_low <- with(subset(nlss, lake.area < 1 & lake.area > 0.1), mean(kCO2_low, na.rm = T)) 
E2_low <- with(subset(nlss, lake.area < 1 & lake.area > 0.1), mean(ECO2_toc_low*365, na.rm = T)) #gC/m2/y

F2_low_realsum <- with(subset(nlss, lake.area < 1 & lake.area > 0.1), sum(ECO2_toc_low * 365/12.011 * 43.99 * 1e-12*lake.area*1e6, na.rm = T)) #MtCO2/y
F2_low_sumfrombin <- with(subset(nlss, lake.area < 1 & lake.area > 0.1), E2_low*sum(lake.area*1e6)/12.011 * 43.99 * 1e-12)



W2 <- 52.36*1e12/(282434*1e6) # gC/m2/year in W
W2_sum <- with(subset(nlss, lake.area < 1 & lake.area > 0.1), W2*sum(lake.area*1e6)/12.011 * 43.99 * 1e-12) # MtCO2/y

# 1 < Lake size < 10 km2

p3 <- nlss %>% subset(lake.area > 1 & lake.area < 10) %>% pull(pCO2_toc) %>% mean()*1e6

k3 <- with(subset(nlss, lake.area > 1 & lake.area < 10), mean(kCO2, na.rm = T)) 
E3 <- with(subset(nlss, lake.area > 1 & lake.area < 10), mean(ECO2_toc*365, na.rm = T)) #gC/m2/y

F3_realsum <- with(subset(nlss, lake.area > 1 & lake.area < 10), sum(ECO2_toc * 365/12.011 * 43.99 * 1e-12*lake.area*1e6, na.rm = T)) #MtCO2/y
F3_sumfrombin <- with(subset(nlss, lake.area > 1 & lake.area < 10), E3*sum(lake.area*1e6)/12.011 * 43.99 * 1e-12)


k3_low <- with(subset(nlss, lake.area > 1 & lake.area < 10), mean(kCO2_low, na.rm = T)) 
E3_low <- with(subset(nlss, lake.area > 1 & lake.area < 10), mean(ECO2_toc_low*365, na.rm = T)) #gC/m2/y

F3_low_realsum <- with(subset(nlss, lake.area > 1 & lake.area < 10), sum(ECO2_toc_low * 365/12.011 * 43.99  * 1e-12*lake.area*1e6, na.rm = T)) #MtCO2/y
F3_low_sumfrombin <- with(subset(nlss, lake.area > 1 & lake.area < 10), E3_low*sum(lake.area*1e6)/12.011 * 43.99  * 1e-12)


W3 <- 22.80*1e12/(286624*1e6) # gC/m2/year in W
W3_sum <- with(subset(nlss, lake.area > 1 & lake.area < 10), W3*sum(lake.area*1e6)/12.011 * 43.99 * 1e-12) # MtCO2/y

# 1 < Lake size < 10 km2

p4 <- nlss %>% subset(lake.area > 10) %>% pull(pCO2_toc) %>% mean()*1e6

k600_4 <- with(subset(nlss, lake.area > 10), mean(k600*24*1e-2, na.rm = T))
k600_4_low <- with(subset(nlss, lake.area > 10), mean(k600_low*24*1e-2, na.rm = T))

k4 <- with(subset(nlss, lake.area > 10), mean(kCO2, na.rm = T)) 
E4 <- with(subset(nlss, lake.area > 10), mean(ECO2_toc*365, na.rm = T)) #gC/m2/y

F4_realsum <- with(subset(nlss, lake.area > 10), sum(ECO2_toc* 365*lake.area*1e6/12.011 * 43.99  * 1e-12, na.rm = T)) #MtCO2/y
F4_sumfrombin <- with(subset(nlss, lake.area > 10), E4*sum(lake.area*1e6)/12.011 * 43.99 * 1e-12)



k4_low <- with(subset(nlss, lake.area > 10), mean(kCO2_low, na.rm = T)) 
E4_low <- with(subset(nlss, lake.area > 10), mean(ECO2_toc_low*365, na.rm = T)) #gC/m2/y

F4_low_realsum <- with(subset(nlss, lake.area > 10), sum(ECO2_toc_low* 365*lake.area*1e6/12.011 * 43.99  * 1e-12, na.rm = T)) #MtCO2/y
F4_low_sumfrombin <- with(subset(nlss, lake.area > 10), E4_low*sum(lake.area*1e6)/12.011 * 43.99  * 1e-12)


W4 <- 53.96*1e12/(570583*1e6) # gC/m2/year in W
W4_sum <- with(subset(nlss, lake.area > 10), W4*sum(lake.area*1e6)/12.011 * 43.99 * 1e-12) # MtCO2/y

bin_FCO2 <- data.frame(lake.size = c("A < 0.1","0.1 < A < 1","1 < A < 10","> 10","sum"," "), 
           pCO2 = c(p1,p2,p3,p4,NA,p0), 
           kCO2_low = c(k1_low,k2_low,k3_low,k4_low,NA,k0_low), 
           E_low = c(E1_low,E2_low,E3_low,E4_low,NA,E0_low),
           FCO2_low_bin = c(F1_low_sumfrombin, F2_low_sumfrombin, F3_low_sumfrombin, F4_low_sumfrombin, sum(F1_low_sumfrombin, F2_low_sumfrombin, F3_low_sumfrombin, F4_low_sumfrombin),F0_low_sumfrommean),
            FCO2_low = c(F1_low_realsum, F2_low_realsum, F3_low_realsum, F4_low_realsum,sum(F1_low_realsum,F2_low_realsum,F3_low_realsum,F4_low_realsum),F0_low_realsum),
           kCO2_high = c(k1,k2,k3,k4, NA,k0),
           E_high = c(E1,E2,E3,E4, NA,E0),
           FCO2_high_bin = c(F1_sumfrombin, F2_sumfrombin, F3_sumfrombin, F4_sumfrombin, sum(F1_sumfrombin, F2_sumfrombin, F3_sumfrombin, F4_sumfrombin),F0_sumfrommean),
          FCO2_high = c(F1_realsum, F2_realsum, F3_realsum, F4_realsum, sum(F1_realsum, F2_realsum, F3_realsum, F4_realsum),F0_realsum),
           Wey = c(W1,W2,W3,W4, NA,NA),
           Wey_sum = c(W1_sum, W2_sum, W3_sum, W4_sum, sum(W1_sum, W2_sum, W3_sum, W4_sum),NA))


kable(bin_FCO2, digits = 2, col.names = c("A = lake area in km2","mean pCO2, uatm","mean kCO2, m2/d","mean ECO2, g/m2/y", "binned FCO2, MtCO2/y","actual FCO2, MtCO2/y","kCO2, m2/d","mean ECO2, g/m2/y", "binned FCO2, MtCO2/y", "actual FCO2, MtCO2/y","mean ECO2, g/m2/day", "FCO2, MtCO2/y")) %>% kable_styling() %>% 
  add_header_above(c("Lake class" = 1, " " = 1, "Low estimate" = 4, "High estimate" = 4, "Hastie et al." = 2)) %>%
  group_rows(group_label= "All lakes", start_row = 6, end_row = 6) %>%
  column_spec(column = c(5,6,9,10,12), bold = T)
Lake class
Low estimate
High estimate
Hastie et al.
A = lake area in km2 mean pCO2, uatm mean kCO2, m2/d mean ECO2, g/m2/y binned FCO2, MtCO2/y actual FCO2, MtCO2/y kCO2, m2/d mean ECO2, g/m2/y binned FCO2, MtCO2/y actual FCO2, MtCO2/y mean ECO2, g/m2/day FCO2, MtCO2/y
A < 0.1 1034.90 0.64 142.24 0.05 0.05 0.85 183.69 0.06 0.06 183.50 0.06
0.1 < A < 1 1019.58 0.71 146.78 0.24 0.23 1.10 220.73 0.36 0.37 185.39 0.30
1 < A < 10 939.33 0.74 130.85 1.14 1.13 1.35 234.94 2.05 2.10 79.55 0.70
> 10 860.24 0.73 106.64 12.65 14.08 1.63 234.97 27.88 36.60 94.57 11.22
sum 14.08 15.49 30.35 39.14 12.28
All lakes
998.84 0.69 139.10 17.99 15.49 1.09 209.77 27.13 39.14

4.6 Bin estimation by country for all inland waters

We estimated the share of inland waters belonging to a given category of lakes (depending on area in km2) based on the proportion represented by each category in the total of the lakes sampled during the Northern European Lakes Survey.

The largest lakes of Norway, Sweden and Finland account for a large part of the total inland waters. Therefore, we included them as a separated category. We assumed that all the lakes with an area larger than 100 km2 were samples in the NLS. Therefore, the CO2 efflux for these big lakes were simply added to the efflux calculated for the rest of the inland water.

ggplot(nlss)+geom_boxplot(aes(x=nation, y = lake.area, col = nation))+coord_trans(y = "log10")+
  scale_color_manual(values = c("firebrick","orange","dodgerblue3"))+
  geom_hline(yintercept = 100)+
  labs(y = "Lake area, km2", x = "")+
  scale_y_continuous(breaks = c(0.1,1,10,100))+
  theme_minimal()

Table 7: Total efflux of CO2 computed from the mean for each lake category, compared to efflux calculated for all lakes in Norway

nor <- filter(nlss, nation == "Norway")

inland.waters <- 20221 * 1e6 # km2 to m2, @SSB

p0 <- mean(nor$pCO2_toc, na.rm = T)*1e6
k0 <- mean(nor$kCO2, na.rm = T)
E0 <- mean(nor$ECO2_toc, na.rm = T)*365 # in gC/m2/year
F0 <- E0*inland.waters/12.011 * 43.99 * 1e-12 #MtCO2/t

k0_low <- mean(nor$kCO2_low, na.rm = T)
E0_low <- mean(nor$ECO2_toc_low, na.rm = T) *365 # in gC/m2/year
F0_low <- E0_low*inland.waters/12.011 * 43.99 * 1e-12 #MtCO2/t


A0 <- sum(subset(nor, lake.area < 100)$lake.area) # exclude big lakes

# Lake size < 0.1 km2

p1 <- nor %>% subset(lake.area < 0.1) %>% pull(pCO2_toc) %>% mean()*1e6
A1 <- nor %>% subset(lake.area < 0.1) %>% pull(lake.area) %>% sum()
A1_inland <- A1/A0*inland.waters

k1 <- with(subset(nor, lake.area < 0.1), mean(kCO2, na.rm = T)) 
E1 <- with(subset(nor, lake.area < 0.1), mean(ECO2_toc*365, na.rm = T)) #gC/m2/y
F1 <- E1*A1_inland/12.011 * 43.99 * 1e-12 #MtCO2/t

  
k1_low <- with(subset(nor, lake.area < 0.1), mean(kCO2_low, na.rm = T)) 
E1_low <- with(subset(nor, lake.area < 0.1), mean(ECO2_toc_low*365, na.rm = T)) #gC/m2/y
F1_low <- E1_low*A1_inland/12.011 * 43.99 * 1e-12

# 0.1 < Lake size < 1 km2

p2 <- nor %>% subset(lake.area < 1 & lake.area > 0.1) %>% pull(pCO2_toc) %>% mean()*1e6
A2 <- nor %>% subset(lake.area < 1 & lake.area > 0.1) %>% pull(lake.area) %>% sum()
A2_inland <- A2/A0*inland.waters

k2 <- with(subset(nor, lake.area < 1 & lake.area > 0.1), mean(kCO2, na.rm = T)) 
E2 <- with(subset(nor, lake.area < 1 & lake.area > 0.1), mean(ECO2_toc*365, na.rm = T)) #gC/m2/y
F2 <- E2*A2_inland/12.011 * 43.99 * 1e-12


k2_low <- with(subset(nor, lake.area < 1 & lake.area > 0.1), mean(kCO2_low, na.rm = T)) 
E2_low <- with(subset(nor, lake.area < 1 & lake.area > 0.1), mean(ECO2_toc_low*365, na.rm = T)) #gC/m2/y
F2_low <- E2_low*A2_inland/12.011 * 43.99 * 1e-12

# 1 < Lake size < 10 km2

p3 <- nor %>% subset(lake.area > 1 & lake.area < 10) %>% pull(pCO2_toc) %>% mean()*1e6
A3 <- nor %>% subset(lake.area > 1 & lake.area < 10) %>% pull(lake.area) %>% sum()
A3_inland <- A3/A0*inland.waters

k3 <- with(subset(nor, lake.area > 1 & lake.area < 10), mean(kCO2, na.rm = T)) 
E3 <- with(subset(nor, lake.area > 1 & lake.area < 10), mean(ECO2_toc*365, na.rm = T)) #gC/m2/y

F3 <- E3*A3_inland/12.011 * 43.99 * 1e-12


k3_low <- with(subset(nor, lake.area > 1 & lake.area < 10), mean(kCO2_low, na.rm = T)) 
E3_low <- with(subset(nor, lake.area > 1 & lake.area < 10), mean(ECO2_toc_low*365, na.rm = T)) #gC/m2/y
F3_low <- E3_low*A3_inland/12.011 * 43.99  * 1e-12

# 10 < Lake size < 100 km2

p4 <- nor %>% subset(lake.area > 10 & lake.area < 100) %>% pull(pCO2_toc) %>% mean()*1e6
A4 <- nor %>% subset(lake.area > 10 & lake.area < 100) %>% pull(lake.area) %>% sum()
A4_inland <- A4/A0*inland.waters

k4 <- with(subset(nor, lake.area > 10 & lake.area < 100), mean(kCO2, na.rm = T)) 
E4 <- with(subset(nor, lake.area > 10 & lake.area < 100), mean(ECO2_toc*365, na.rm = T)) #gC/m2/y
F4 <- E4*A4_inland/12.011 * 43.99 * 1e-12

k4_low <- with(subset(nor, lake.area > 10 & lake.area < 100), mean(kCO2_low, na.rm = T)) 
E4_low <- with(subset(nor, lake.area > 10 & lake.area < 100), mean(ECO2_toc_low*365, na.rm = T)) #gC/m2/y
F4_low <- E4_low*A4_inland/12.011 * 43.99  * 1e-12

# Big lakes > 100 km2
p5 <- nor %>% subset(lake.area > 100) %>% pull(pCO2_toc) %>% mean()*1e6
A5 <- nor %>% subset(lake.area > 100) %>% pull(lake.area) %>% sum()*1e6 # km2 to m2

k5 <- with(subset(nor, lake.area > 100), mean(kCO2, na.rm = T)) 
E5 <- with(subset(nor, lake.area > 100), mean(ECO2_toc*365, na.rm = T)) #gC/m2/y
F5 <- E5*A5/12.011 * 43.99 * 1e-12

k5_low <- with(subset(nor, lake.area > 100), mean(kCO2_low, na.rm = T)) 
E5_low <- with(subset(nor, lake.area > 100), mean(ECO2_toc_low*365, na.rm = T)) #gC/m2/y
F5_low <- E5_low*A5/12.011 * 43.99  * 1e-12


bin_FCO2 <- data.frame(
            lake.size = c("A < 0.1","0.1 < A < 1","1 < A < 10","> 10", "Big lakes (> 100)","Total","All lakes"),
           pCO2 = c(p1,p2,p3,p4,p5,NA,p0), 
           kCO2_low = c(k1_low,k2_low,k3_low,k4_low, k5_low,NA,k0_low), 
           E_low = c(E1_low,E2_low,E3_low,E4_low,E5_low,NA, E0_low),
           FCO2_low = c(F1_low, F2_low, F3_low, F4_low, F5_low, sum(F1_low, F2_low, F3_low, F4_low, F5_low), F0_low),
           kCO2_high = c(k1,k2,k3,k4,k5, NA,k0),
           E_high = c(E1,E2,E3,E4, E5,NA, E0),
           FCO2_high = c(F1, F2, F3, F4, F5, sum(F1, F2, F3, F4, F5),F0))


kable(bin_FCO2, digits = 2, col.names = c("A = lake area in km2","mean pCO2, uatm","mean kCO2, m2/d","mean ECO2, g/m2/y", "binned FCO2, MtCO2/y","kCO2, m2/d","mean ECO2, g/m2/y", "binned FCO2, MtCO2/y")) %>% kable_styling() %>% 
  add_header_above(c("Lake class" = 1, " " = 1, "Low estimate" = 3, "High estimate" = 3)) %>%
  group_rows(group_label = "All lakes", start_row = 7, end_row = 7)
Lake class
Low estimate
High estimate
A = lake area in km2 mean pCO2, uatm mean kCO2, m2/d mean ECO2, g/m2/y binned FCO2, MtCO2/y kCO2, m2/d mean ECO2, g/m2/y binned FCO2, MtCO2/y
A < 0.1 696.27 0.41 43.15 0.06 0.61 64.37 0.08
0.1 < A < 1 659.69 0.42 38.41 0.21 0.70 63.63 0.35
1 < A < 10 592.30 0.36 29.86 0.72 0.70 57.11 1.38
> 10 551.21 0.34 25.84 1.12 0.74 56.17 2.43
Big lakes (> 100) 562.28 0.31 20.75 0.08 0.71 47.56 0.18
Total 2.18 4.42
All lakes
All lakes 656.99 0.40 38.11 2.82 0.67 62.28 4.61

Table 8: Total efflux of CO2 computed from the mean for each lake category, compared to efflux calculated for all lakes in Sweden

swe <- filter(nlss, nation == "Sweden")

inland.waters <- 4014169.92 * 1e4 # ha to m2, @SCB

p0 <- mean(swe$pCO2_toc, na.rm = T)*1e6
k0 <- mean(swe$kCO2, na.rm = T)
E0 <- mean(swe$ECO2_toc, na.rm = T)*365 # in gC/m2/year
F0 <- E0*inland.waters/12.011 * 43.99 * 1e-12 #MtCO2/t

k0_low <- mean(swe$kCO2_low, na.rm = T)
E0_low <- mean(swe$ECO2_toc_low, na.rm = T) *365 # in gC/m2/year
F0_low <- E0_low*inland.waters/12.011 * 43.99 * 1e-12 #MtCO2/t


A0 <- sum(subset(swe, lake.area < 100)$lake.area) # exclude big lakes

# Lake size < 0.1 km2

p1 <- swe %>% subset(lake.area < 0.1) %>% pull(pCO2_toc) %>% mean()*1e6
A1 <- swe %>% subset(lake.area < 0.1) %>% pull(lake.area) %>% sum()
A1_inland <- A1/A0*inland.waters

k1 <- with(subset(swe, lake.area < 0.1), mean(kCO2, na.rm = T)) 
E1 <- with(subset(swe, lake.area < 0.1), mean(ECO2_toc*365, na.rm = T)) #gC/m2/y
F1 <- E1*A1_inland/12.011 * 43.99 * 1e-12 #MtCO2/t

  
k1_low <- with(subset(swe, lake.area < 0.1), mean(kCO2_low, na.rm = T)) 
E1_low <- with(subset(swe, lake.area < 0.1), mean(ECO2_toc_low*365, na.rm = T)) #gC/m2/y
F1_low <- E1_low*A1_inland/12.011 * 43.99 * 1e-12

# 0.1 < Lake size < 1 km2

p2 <- swe %>% subset(lake.area < 1 & lake.area > 0.1) %>% pull(pCO2_toc) %>% mean()*1e6
A2 <- swe %>% subset(lake.area < 1 & lake.area > 0.1) %>% pull(lake.area) %>% sum()
A2_inland <- A2/A0*inland.waters

k2 <- with(subset(swe, lake.area < 1 & lake.area > 0.1), mean(kCO2, na.rm = T)) 
E2 <- with(subset(swe, lake.area < 1 & lake.area > 0.1), mean(ECO2_toc*365, na.rm = T)) #gC/m2/y
F2 <- E2*A2_inland/12.011 * 43.99 * 1e-12


k2_low <- with(subset(swe, lake.area < 1 & lake.area > 0.1), mean(kCO2_low, na.rm = T)) 
E2_low <- with(subset(swe, lake.area < 1 & lake.area > 0.1), mean(ECO2_toc_low*365, na.rm = T)) #gC/m2/y
F2_low <- E2_low*A2_inland/12.011 * 43.99 * 1e-12

# 1 < Lake size < 10 km2

p3 <- swe %>% subset(lake.area > 1 & lake.area < 10) %>% pull(pCO2_toc) %>% mean()*1e6
A3 <- swe %>% subset(lake.area > 1 & lake.area < 10) %>% pull(lake.area) %>% sum()
A3_inland <- A3/A0*inland.waters

k3 <- with(subset(swe, lake.area > 1 & lake.area < 10), mean(kCO2, na.rm = T)) 
E3 <- with(subset(swe, lake.area > 1 & lake.area < 10), mean(ECO2_toc*365, na.rm = T)) #gC/m2/y

F3 <- E3*A3_inland/12.011 * 43.99 * 1e-12


k3_low <- with(subset(swe, lake.area > 1 & lake.area < 10), mean(kCO2_low, na.rm = T)) 
E3_low <- with(subset(swe, lake.area > 1 & lake.area < 10), mean(ECO2_toc_low*365, na.rm = T)) #gC/m2/y
F3_low <- E3_low*A3_inland/12.011 * 43.99  * 1e-12

# 1 < Lake size < 10 km2

p4 <- swe %>% subset(lake.area > 10 & lake.area < 100) %>% pull(pCO2_toc) %>% mean()*1e6
A4 <- swe %>% subset(lake.area > 10 & lake.area < 100) %>% pull(lake.area) %>% sum()
A4_inland <- A4/A0*inland.waters

k4 <- with(subset(swe, lake.area > 10 & lake.area < 100), mean(kCO2, na.rm = T)) 
E4 <- with(subset(swe, lake.area > 10 & lake.area < 100), mean(ECO2_toc*365, na.rm = T)) #gC/m2/y
F4 <- E4*A4_inland/12.011 * 43.99 * 1e-12

k4_low <- with(subset(swe, lake.area > 10 & lake.area < 100), mean(kCO2_low, na.rm = T)) 
E4_low <- with(subset(swe, lake.area > 10 & lake.area < 100), mean(ECO2_toc_low*365, na.rm = T)) #gC/m2/y
F4_low <- E4_low*A4_inland/12.011 * 43.99  * 1e-12

# Big lakes
p5 <- swe %>% subset(lake.area > 100) %>% pull(pCO2_toc) %>% mean()*1e6
A5 <- swe %>% subset(lake.area > 100) %>% pull(lake.area) %>% sum()*1e6 # km2 to m2

k5 <- with(subset(swe, lake.area > 100), mean(kCO2, na.rm = T)) 
E5 <- with(subset(swe, lake.area > 100), mean(ECO2_toc*365, na.rm = T)) #gC/m2/y
F5 <- E5*A5/12.011 * 43.99 * 1e-12

k5_low <- with(subset(swe, lake.area > 100), mean(kCO2_low, na.rm = T)) 
E5_low <- with(subset(swe, lake.area > 100), mean(ECO2_toc_low*365, na.rm = T)) #gC/m2/y
F5_low <- E5_low*A5/12.011 * 43.99  * 1e-12


bin_FCO2 <- data.frame(
            lake.size = c("A < 0.1","0.1 < A < 1","1 < A < 10","> 10", "Big lakes (> 100)","Total","All lakes"),
           pCO2 = c(p1,p2,p3,p4,p5,NA,p0), 
           kCO2_low = c(k1_low,k2_low,k3_low,k4_low, k5_low,NA,k0_low), 
           E_low = c(E1_low,E2_low,E3_low,E4_low,E5_low,NA, E0_low),
           FCO2_low = c(F1_low, F2_low, F3_low, F4_low, F5_low, sum(F1_low, F2_low, F3_low, F4_low, F5_low), F0_low),
           kCO2_high = c(k1,k2,k3,k4,k5, NA,k0),
           E_high = c(E1,E2,E3,E4, E5,NA, E0),
           FCO2_high = c(F1, F2, F3, F4, F5, sum(F1, F2, F3, F4, F5),F0))


kable(bin_FCO2, digits = 2, col.names = c("A = lake area in km2","mean pCO2, uatm","mean kCO2, m2/d","mean ECO2, g/m2/y", "binned FCO2, MtCO2/y","kCO2, m2/d","mean ECO2, g/m2/y", "binned FCO2, MtCO2/y")) %>% kable_styling() %>% 
  add_header_above(c("Lake class" = 1, " " = 1, "Low estimate" = 3, "High estimate" = 3)) %>%
  group_rows(group_label = "All lakes", start_row = 7, end_row = 7)
Lake class
Low estimate
High estimate
A = lake area in km2 mean pCO2, uatm mean kCO2, m2/d mean ECO2, g/m2/y binned FCO2, MtCO2/y kCO2, m2/d mean ECO2, g/m2/y binned FCO2, MtCO2/y
A < 0.1 1080.00 0.68 161.40 0.28 0.90 205.49 0.36
0.1 < A < 1 1102.33 0.81 181.23 1.69 1.24 269.52 2.51
1 < A < 10 992.61 0.84 154.71 7.56 1.52 276.16 13.49
> 10 832.69 0.80 113.22 9.86 1.69 236.89 20.64
Big lakes (> 100) 721.26 0.73 81.91 3.50 1.85 206.87 8.85
Total 22.89 45.84
All lakes
All lakes 1054.94 0.76 163.47 24.03 1.18 241.88 35.56

Table 9: Total efflux of CO2 computed from the mean for each lake category, compared to efflux calculated for all lakes in Finland

fin <- filter(nlss, nation == "Finland")

inland.waters <- 34515 * 1e6 # km2 to m2, @Statfi

p0 <- mean(fin$pCO2_toc, na.rm = T)*1e6
k0 <- mean(fin$kCO2, na.rm = T)
E0 <- mean(fin$ECO2_toc, na.rm = T)*365 # in gC/m2/year
F0 <- E0*inland.waters/12.011 * 43.99 * 1e-12 #MtCO2/t

k0_low <- mean(fin$kCO2_low, na.rm = T)
E0_low <- mean(fin$ECO2_toc_low, na.rm = T) *365 # in gC/m2/year
F0_low <- E0_low*inland.waters/12.011 * 43.99 * 1e-12 #MtCO2/t


A0 <- sum(subset(fin, lake.area < 100)$lake.area) # exclude big lakes

# Lake size < 0.1 km2

p1 <- fin %>% subset(lake.area < 0.1) %>% pull(pCO2_toc) %>% mean()*1e6
A1 <- fin %>% subset(lake.area < 0.1) %>% pull(lake.area) %>% sum()
A1_inland <- A1/A0*inland.waters

k1 <- with(subset(fin, lake.area < 0.1), mean(kCO2, na.rm = T)) 
E1 <- with(subset(fin, lake.area < 0.1), mean(ECO2_toc*365, na.rm = T)) #gC/m2/y
F1 <- E1*A1_inland/12.011 * 43.99 * 1e-12 #MtCO2/t

  
k1_low <- with(subset(fin, lake.area < 0.1), mean(kCO2_low, na.rm = T)) 
E1_low <- with(subset(fin, lake.area < 0.1), mean(ECO2_toc_low*365, na.rm = T)) #gC/m2/y
F1_low <- E1_low*A1_inland/12.011 * 43.99 * 1e-12

# 0.1 < Lake size < 1 km2

p2 <- fin %>% subset(lake.area < 1 & lake.area > 0.1) %>% pull(pCO2_toc) %>% mean()*1e6
A2 <- fin %>% subset(lake.area < 1 & lake.area > 0.1) %>% pull(lake.area) %>% sum()
A2_inland <- A2/A0*inland.waters

k2 <- with(subset(fin, lake.area < 1 & lake.area > 0.1), mean(kCO2, na.rm = T)) 
E2 <- with(subset(fin, lake.area < 1 & lake.area > 0.1), mean(ECO2_toc*365, na.rm = T)) #gC/m2/y
F2 <- E2*A2_inland/12.011 * 43.99 * 1e-12


k2_low <- with(subset(fin, lake.area < 1 & lake.area > 0.1), mean(kCO2_low, na.rm = T)) 
E2_low <- with(subset(fin, lake.area < 1 & lake.area > 0.1), mean(ECO2_toc_low*365, na.rm = T)) #gC/m2/y
F2_low <- E2_low*A2_inland/12.011 * 43.99 * 1e-12

# 1 < Lake size < 10 km2

p3 <- fin %>% subset(lake.area > 1 & lake.area < 10) %>% pull(pCO2_toc) %>% mean()*1e6
A3 <- fin %>% subset(lake.area > 1 & lake.area < 10) %>% pull(lake.area) %>% sum()
A3_inland <- A3/A0*inland.waters

k3 <- with(subset(fin, lake.area > 1 & lake.area < 10), mean(kCO2, na.rm = T)) 
E3 <- with(subset(fin, lake.area > 1 & lake.area < 10), mean(ECO2_toc*365, na.rm = T)) #gC/m2/y

F3 <- E3*A3_inland/12.011 * 43.99 * 1e-12


k3_low <- with(subset(fin, lake.area > 1 & lake.area < 10), mean(kCO2_low, na.rm = T)) 
E3_low <- with(subset(fin, lake.area > 1 & lake.area < 10), mean(ECO2_toc_low*365, na.rm = T)) #gC/m2/y
F3_low <- E3_low*A3_inland/12.011 * 43.99  * 1e-12

# 1 < Lake size < 10 km2

p4 <- fin %>% subset(lake.area > 10 & lake.area < 100) %>% pull(pCO2_toc) %>% mean()*1e6
A4 <- fin %>% subset(lake.area > 10 & lake.area < 100) %>% pull(lake.area) %>% sum()
A4_inland <- A4/A0*inland.waters

k4 <- with(subset(fin, lake.area > 10 & lake.area < 100), mean(kCO2, na.rm = T)) 
E4 <- with(subset(fin, lake.area > 10 & lake.area < 100), mean(ECO2_toc*365, na.rm = T)) #gC/m2/y
F4 <- E4*A4_inland/12.011 * 43.99 * 1e-12

k4_low <- with(subset(fin, lake.area > 10 & lake.area < 100), mean(kCO2_low, na.rm = T)) 
E4_low <- with(subset(fin, lake.area > 10 & lake.area < 100), mean(ECO2_toc_low*365, na.rm = T)) #gC/m2/y
F4_low <- E4_low*A4_inland/12.011 * 43.99  * 1e-12

# Big lakes
p5 <- fin %>% subset(lake.area > 100) %>% pull(pCO2_toc) %>% mean()*1e6
A5 <- fin %>% subset(lake.area > 100) %>% pull(lake.area) %>% sum()*1e6 # km2 to m2

k5 <- with(subset(fin, lake.area > 100), mean(kCO2, na.rm = T)) 
E5 <- with(subset(fin, lake.area > 100), mean(ECO2_toc*365, na.rm = T)) #gC/m2/y
F5 <- E5*A5/12.011 * 43.99 * 1e-12

k5_low <- with(subset(fin, lake.area > 100), mean(kCO2_low, na.rm = T)) 
E5_low <- with(subset(fin, lake.area > 100), mean(ECO2_toc_low*365, na.rm = T)) #gC/m2/y
F5_low <- E5_low*A5/12.011 * 43.99  * 1e-12


bin_FCO2 <- data.frame(
            lake.size = c("A < 0.1","0.1 < A < 1","1 < A < 10","> 10", "Big lakes (> 100)","Total","All lakes"),
           pCO2 = c(p1,p2,p3,p4,p5,NA,p0), 
           kCO2_low = c(k1_low,k2_low,k3_low,k4_low, k5_low,NA,k0_low), 
           E_low = c(E1_low,E2_low,E3_low,E4_low,E5_low,NA, E0_low),
           FCO2_low = c(F1_low, F2_low, F3_low, F4_low, F5_low, sum(F1_low, F2_low, F3_low, F4_low, F5_low), F0_low),
           kCO2_high = c(k1,k2,k3,k4,k5, NA,k0),
           E_high = c(E1,E2,E3,E4, E5,NA, E0),
           FCO2_high = c(F1, F2, F3, F4, F5, sum(F1, F2, F3, F4, F5),F0))


kable(bin_FCO2, digits = 2, col.names = c("A = lake area in km2","mean pCO2, uatm","mean kCO2, m2/d","mean ECO2, g/m2/y", "binned FCO2, MtCO2/y","kCO2, m2/d","mean ECO2, g/m2/y", "binned FCO2, MtCO2/y")) %>% kable_styling() %>% 
  add_header_above(c("Lake class" = 1, " " = 1, "Low estimate" = 3, "High estimate" = 3)) %>%
  group_rows(group_label = "All lakes", start_row = 7, end_row = 7)
Lake class
Low estimate
High estimate
A = lake area in km2 mean pCO2, uatm mean kCO2, m2/d mean ECO2, g/m2/y binned FCO2, MtCO2/y kCO2, m2/d mean ECO2, g/m2/y binned FCO2, MtCO2/y
A < 0.1 1266.42 0.72 184.36 0.17 0.95 239.76 0.22
0.1 < A < 1 1160.03 0.73 157.06 0.72 1.10 239.27 1.09
1 < A < 10 1072.12 0.78 145.35 4.32 1.41 262.71 7.81
> 10 1077.45 0.78 138.04 12.59 1.66 298.41 27.22
Big lakes (> 100) 963.51 0.81 121.63 6.57 2.03 304.46 16.45
Total 24.37 52.79
All lakes
All lakes 1158.94 0.75 160.23 20.25 1.20 251.88 31.84

4.7 Compare with Raymond et al. (2013)

Raymond et al. (2013) estimate the CO2 emissions as gC/m2 land/year. We convert our estimated median CO2 evasion rate from gC/m2/year to gC/m2 land/year by multiplying it by the water area in a given region, and then dividing it by the land area.

Table 10: Evasion rate of CO2 per unit of area (gC/m2/year) and per unit of land area (gC/m2 land/year), low and high estimates

inland.waters.nor <- 20221 * 1e6 # km2 to m2, @SSB
land.nor <- land <- 323806 * 1e6 # km2 to m2, @SSB
inland.waters.swe <- 4014169.92 * 1e4 # ha to m2, @SCB
land.swe <- 444435 * 1e6 # km2 to m2
inland.waters.fin <- 34515 * 1e6 # km2 to m2, @Statfi
land.fin <- 338478 * 1e6 # km2 to m2

tot.water <- inland.waters.nor + inland.waters.swe + inland.waters.fin
tot.land <- land.nor + land.swe + land.fin

mean.eco2.high <- c(median(nlss$ECO2_toc, na.rm = T)*365, #NLSS # in gC/m2/year
                    with(subset(nlss, nation == "Norway"), median(ECO2_toc, na.rm = T)*365),
                    with(subset(nlss, nation == "Sweden"), median(ECO2_toc, na.rm = T)*365),
                    with(subset(nlss, nation == "Finland"), median(ECO2_toc, na.rm = T)*365))

mean.eco2.low <- c(median(nlss$ECO2_toc_low, na.rm = T)*365, #NLSS # in gC/m2/year
                    with(subset(nlss, nation == "Norway"), median(ECO2_toc_low, na.rm = T)*365),
                    with(subset(nlss, nation == "Sweden"), median(ECO2_toc_low, na.rm = T)*365),
                    with(subset(nlss, nation == "Finland"), median(ECO2_toc_low, na.rm = T)*365))                    
                    
land.eco2.high <- c(mean.eco2.high[1]*tot.water/tot.land,
                    mean.eco2.high[2]*inland.waters.nor/land.nor,
                    mean.eco2.high[3]*inland.waters.swe/land.swe,
                    mean.eco2.high[4]*inland.waters.fin/land.fin)

land.eco2.low <- c(mean.eco2.low[1]*tot.water/tot.land,
                    mean.eco2.low[2]*inland.waters.nor/land.nor,
                    mean.eco2.low[3]*inland.waters.swe/land.swe,
                    mean.eco2.low[4]*inland.waters.fin/land.fin)

kable(tibble("Country" = c("All 3", "Norway", "Sweden", "Finland"),
            "Water"= c(tot.water, inland.waters.nor, inland.waters.swe, inland.waters.fin),
            "Land" = c(tot.land, land.nor, land.swe, land.fin),
            mean.eco2.high,
            land.eco2.high,
            mean.eco2.low,
            land.eco2.low),
      col.names = c("Country","Water area, km2", "Land area, km2", "Median ECO2, gC/m2/year", "Median ECO2, gC/m2 land/year", "Median ECO2, gC/m2/year", "Median ECO2, gC/m2 land/year"),
      digits = 2) %>% kable_styling() %>%
  add_header_above(c(" ","Area" = 2, "High estimate" = 2, "Low estimate" = 2))
Area
High estimate
Low estimate
Country Water area, km2 Land area, km2 Median ECO2, gC/m2/year Median ECO2, gC/m2 land/year Median ECO2, gC/m2/year Median ECO2, gC/m2 land/year
All 3 94877699200 1.106719e+12 175.38 15.04 106.94 9.17
Norway 20221000000 3.238060e+11 41.15 2.57 24.08 1.50
Sweden 40141699200 4.444350e+11 209.71 18.94 131.65 11.89
Finland 34515000000 3.384780e+11 235.52 24.02 142.98 14.58
knitr::knit_exit()
Appelo, C. A. J., and D. Postma. 2005. GEOCHEMISTRY, GROUNDWATER AND POLLUTION 2nd edition. Vol. 59. 703.
Cai, Wei-Jun, Yongchen Wang, and Robert Hodson. 1998. Acid-base proporties of dissolved organic matter in the estuarine waters of Georgia, USA.” Geochimica Et Cosmochimica Acta 62 (3): 473–83. https://doi.org/https://doi.org/10.1016/S0016-7037(97)00363-3.
Cole, Jonathan J., and Nina F. Caraco. 1998. Atmospheric exchange of carbon dioxide in a low-wind oligotrophic lake measured by the addition of SF6.” Limnology and Oceanography 43 (4): 647–56. https://doi.org/10.4319/lo.1998.43.4.0647.
Couturier, M., Y. T. Prairie, A. M. Paterson, E. J. S. Emilson, and P. A. del Giorgio. 2022. Long-Term Trends in pCO2 in Lake Surface Water Following Rebrowning.” Geophysical Research Letters 49 (11): 1–9. https://doi.org/10.1029/2022GL097973.
Crusius, John, and Rik Wanninkhof. 2003. Gas transfer velocities measured at low wind speed over a lake.” Limnology and Oceanography 48 (3): 1010–17. https://doi.org/10.4319/lo.2003.48.3.1010.
Harned, Herbert S., and Raymond Davis. 1943. The Ionization Constant of Carbonic Acid in Water and the Solubility of Carbon Dioxide in Water and Aqueous Salt Solutions from 0 to 50°.” Journal of the American Chemical Society 65 (10): 2030–37. https://doi.org/10.1021/ja01250a059.
Harned, Herbert S, and Samuel R Scholes. 1941. The ionization Constant of HCO3- from 0 to 50°.” Journal of American Chemical Society 63 (6): 1706–9. https://pubs.acs.org/sharingguidelines.
Hastie, Adam, Ronny Lauerwald, Gesa Weyhenmeyer, Sebastian Sobek, Charles Verpoorter, and Pierre Regnier. 2018. CO2 evasion from boreal lakes: Revised estimate, drivers of spatial variability, and future projections.” Global Change Biology 24 (2): 711–28. https://doi.org/10.1111/gcb.13902.
Hruška, Jakub, Stephan Köhler, Hjalmar Laudon, and Kevin Bishop. 2003. Is a universal model of organic acidity possible: Comparison of the acid/base properties of dissolved organic carbon in the boreal and temperate zones.” Environmental Science and Technology 37 (9): 1726–30. https://doi.org/10.1021/es0201552.
Liu, Shaoda, David E. Butman, and Peter A. Raymond. 2020. Evaluating CO2 calculation error from organic alkalinity and pH measurement error in low ionic strength freshwaters.” Limnology and Oceanography: Methods 18 (10): 606–22. https://doi.org/10.1002/lom3.10388.
Lozovik, P. A. 2005. Contribution of organic acid anions to the alkalinity of natural humic water.” Journal of Analytical Chemistry 60 (11): 1000–1004. https://doi.org/10.1007/s10809-005-0226-3.
Raymond, Peter A., Jens Hartmann, Ronny Lauerwald, Sebastian Sobek, Cory McDonald, Mark Hoover, David Butman, et al. 2013. Global carbon dioxide emissions from inland waters.” Nature 503 (7476): 355–59. https://doi.org/10.1038/nature12760.
Vachon, Dominic, and Yves T. Prairie. 2013. The ecosystem size and shape dependence of gas transfer velocity versus wind speed relationships in lakes.” Canadian Journal of Fisheries and Aquatic Sciences 70 (12): 1757–64. https://doi.org/10.1139/cjfas-2013-0241.
Wanninkhof, Rik. 2014. Relationship between wind speed and gas exchange over the ocean revisited.” Limnology and Oceanography: Methods 12 (JUN): 351–62. https://doi.org/10.4319/lom.2014.12.351.
Weiss, R. F. 1974. Carbon dioxide in water and seawater: the solubility of a non-ideal gas.” Marine Chemistry 2: 203–15. https://doi.org/10.5194/bg-13-841-2016.